Machine Learning: Train/Test

Data Science Explorer

Machine Learning

grace21110 2023. 11. 18. 22:10

To evaluate whether a model is good enough, we can use a method called Train/Test.

Train/Test

Train/Test is a method for measuring the accuracy of a model. It is called Train/Test because you split the data set into two sets: a training set and a testing set.

Training the model means creating the model. Testing the model means measuring the accuracy of the model.

A common split is 80% for training and 20% for testing.

Example

Our data set illustrates 100 customers in a shop, and their shopping habits.

import numpy
import matplotlib.pyplot as plt
numpy.random.seed(2)

x = numpy.random.normal(1, 1, 100)
y = numpy.random.normal(50, 40, 100) / x

plt.scatter(x, y)
plt.show()

Result

The x axis represents the number of minutes before making a purchase.

The y axis represents the amount of money spent on the purchase.

Split Into Train/Test

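The training set should be a random selection of 80 percent of the original data.

The testing set should be the remaining 20 percent. Because the data here was generated in random order, taking the first 80 points is effectively a random selection.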

train_x = x[:80]
train_y = y[:80]

test_x = x[80:]
test_y = y[80:]
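For data that arrives in a meaningful order (sorted by date, for example), slicing off the first 80 rows would not be a random selection. A minimal sketch of a shuffled 80/20 split follows. The helper name shuffled_split, its parameters, and the toy x and y arrays are made up for illustration and are not part of the original example.

```python
import numpy

def shuffled_split(x, y, train_fraction=0.8, seed=2):
    """Shuffle the row order, then cut it into a training and a testing part."""
    rng = numpy.random.default_rng(seed)
    order = rng.permutation(len(x))            # random ordering of the row indices
    cut = int(round(len(x) * train_fraction))  # 80 when there are 100 rows
    train_idx, test_idx = order[:cut], order[cut:]
    return x[train_idx], y[train_idx], x[test_idx], y[test_idx]

# Toy data in a deliberately sorted order (hypothetical, for illustration only)
x = numpy.arange(100, dtype=float)
y = x * 2.0

train_x, train_y, test_x, test_y = shuffled_split(x, y)
print(len(train_x), len(test_x))
```

Each x value stays paired with its y value because both arrays are indexed with the same shuffled indices.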

Display the training set

plt.scatter(train_x, train_y)

plt.show()

Result: scatter plot of the training set

Display the testing set

plt.scatter(test_x, test_y)
plt.show()

Result: scatter plot of the testing set

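To check how well a model fits the data, we can use the R-squared (R2) score. It measures the relationship between the values on the x axis and the values on the y axis: 1 means a perfect fit and 0 means no relationship. As computed by sklearn, the score can also be negative when the model fits worse than simply predicting the mean.

The sklearn module has a function called r2_score() that will help us find this relationship.

In this case we would like to measure the relationship between the minutes a customer stays in the shop and how much money they spend. The model is a polynomial regression of degree 4, fitted to the training set only.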
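To make the score less of a black box, here is a small sketch that computes R2 by hand with the standard formula, 1 minus the residual sum of squares divided by the total sum of squares. The function name r2_by_hand and the four-point data are invented for illustration. For plain one-dimensional inputs like these, this should give the same value as r2_score().

```python
import numpy

def r2_by_hand(actual, predicted):
    """R2 = 1 - (sum of squared errors) / (sum of squared deviations from the mean)."""
    actual = numpy.asarray(actual, dtype=float)
    predicted = numpy.asarray(predicted, dtype=float)
    ss_res = numpy.sum((actual - predicted) ** 2)
    ss_tot = numpy.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

actual = numpy.array([1.0, 2.0, 3.0, 4.0])

perfect = r2_by_hand(actual, actual)                 # predictions equal the data
mean_only = r2_by_hand(actual, numpy.full(4, 2.5))   # always predict the mean
reversed_fit = r2_by_hand(actual, actual[::-1])      # predictions point the wrong way

print(perfect)       # 1.0
print(mean_only)     # 0.0
print(reversed_fit)  # -3.0
```

The last case shows why a score can drop below 0: the predictions are further from the data than the mean is.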

import numpy
from sklearn.metrics import r2_score
numpy.random.seed(2)

x = numpy.random.normal(1, 1, 100)
y = numpy.random.normal(50, 40, 100) / x

train_x = x[:80]
train_y = y[:80]

test_x = x[80:]
test_y = y[80:]

mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))

r2 = r2_score(train_y, mymodel(train_x))

print(r2)

0.035400891945391755

Bring in the testing set

import numpy
from sklearn.metrics import r2_score
numpy.random.seed(2)

x = numpy.random.normal(1, 1, 100)
y = numpy.random.normal(50, 40, 100) / x

train_x = x[:80]
train_y = y[:80]

test_x = x[80:]
test_y = y[80:]

mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))

r2 = r2_score(test_y, mymodel(test_x))

print(r2)

-3.0082394488149955
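A testing score that is much lower than the training score suggests the model does not carry over well to data it has not seen. One way to explore this is to compare both scores for several polynomial degrees. The sketch below reuses the data and split from the example. The helper name train_and_test_r2 and the choice of degrees 1, 2 and 4 are assumptions made for illustration.

```python
import numpy
from sklearn.metrics import r2_score

numpy.random.seed(2)

x = numpy.random.normal(1, 1, 100)
y = numpy.random.normal(50, 40, 100) / x

train_x, train_y = x[:80], y[:80]
test_x, test_y = x[80:], y[80:]

def train_and_test_r2(degree):
    """Fit a polynomial on the training set only, then score it on both sets."""
    model = numpy.poly1d(numpy.polyfit(train_x, train_y, degree))
    train_r2 = r2_score(train_y, model(train_x))
    test_r2 = r2_score(test_y, model(test_x))
    return train_r2, test_r2

# Score a few candidate degrees side by side
scores = {degree: train_and_test_r2(degree) for degree in (1, 2, 4)}

for degree, (train_r2, test_r2) in scores.items():
    print(degree, train_r2, test_r2)
```

A higher degree can never lower the training score of a least-squares fit, so the testing score is the number to watch when choosing between them.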

Predict Values

Example

How much money will a buying customer spend, if she or he stays in the shop for 5 minutes?

print(mymodel(5))
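A prediction is only as trustworthy as the data behind it. In this example the x values were drawn from a normal distribution centred on 1, so 5 minutes may fall outside the range the model was trained on, and a degree 4 polynomial can behave erratically there. A minimal sketch of a range check follows. The helper name within_training_range is hypothetical.

```python
import numpy

def within_training_range(value, train_values):
    """Return True if value lies between the smallest and largest training x."""
    return bool(train_values.min() <= value <= train_values.max())

numpy.random.seed(2)

x = numpy.random.normal(1, 1, 100)
train_x = x[:80]

print(train_x.min(), train_x.max())
print(within_training_range(5, train_x))
```

If the check prints False, the value from mymodel(5) is an extrapolation and should be treated with caution.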
