Data Science Explorer
Category: machine learning (11 posts)

To measure whether the model is good or not, we can use a method called Train/Test. It measures the accuracy of the model, and it is called train/test because you separate the data set into two parts: a training set and a testing set. Training the model means creating the model, and testing the model means measuring its accuracy. A common split is 80% for training and 20% for testing. Example Our data set illus..
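The 80/20 split can be sketched with NumPy. The data set below is made up for illustration, and the split simply takes the first 80 of 100 points:

```python
import numpy

numpy.random.seed(2)  # make the random data reproducible

# Hypothetical data set: 100 x/y pairs
x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

# 80% for training, 20% for testing
train_x, test_x = x[:80], x[80:]
train_y, test_y = y[:80], y[80:]

print(len(train_x), len(test_x))  # 80 20
```

In practice you would shuffle the data first so the test set is not biased by the original ordering.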

Polynomial Regression It uses the relationship between the variables x and y to find the best way to draw a line through the data points. import numpy import matplotlib.pyplot as plt x = [1,2,3,5,6,7,8,9,10,12,13,14,15,16,18,19,21,22] y = [100,90,80,60,60,55,60,65,70,70,75,76,78,79,90,99,99,100] mymodel = numpy.poly1d(numpy.polyfit(x, y, 3)) myline = numpy.linspace(1, 22, 100) plt.scatter(x, y) plt...
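A runnable version of the excerpt's fit (plotting omitted), with an r-squared check computed by hand as a rough measure of how well the curve fits; the r-squared part is an addition not shown in the excerpt:

```python
import numpy

x = numpy.array([1,2,3,5,6,7,8,9,10,12,13,14,15,16,18,19,21,22])
y = numpy.array([100,90,80,60,60,55,60,65,70,70,75,76,78,79,90,99,99,100])

# Fit a degree-3 polynomial through the data points
mymodel = numpy.poly1d(numpy.polyfit(x, y, 3))

# Evenly spaced x values for drawing the fitted curve
myline = numpy.linspace(1, 22, 100)

# r-squared: 1 minus (residual sum of squares / total sum of squares)
predicted = mymodel(x)
ss_res = numpy.sum((y - predicted) ** 2)
ss_tot = numpy.sum((y - numpy.mean(y)) ** 2)
r2 = 1 - ss_res / ss_tot
print(round(r2, 2))
```

An r-squared close to 1 means the polynomial explains most of the variation in y.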
I was working on data cleaning and ran into a problem today. I had to duplicate each row and put the copies together, and I did not know how to deal with it. It took me a few hours of thinking and looking up ways to figure it out. Luckily, I found a way out, so I am going to show you how I did it! Step #1: You have to import the data. import pandas as pd ss = pd.read_csv('/content/총물량데이터.csv') print(ss) Ste..
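One way to duplicate every row of a DataFrame is `index.repeat` plus `loc`. Since the original CSV isn't shown, this sketch uses a toy DataFrame in its place:

```python
import pandas as pd

# Toy stand-in for the real CSV data
ss = pd.DataFrame({"item": ["a", "b", "c"], "qty": [1, 2, 3]})

# Repeat each row's index label twice, then select those rows in order
doubled = ss.loc[ss.index.repeat(2)].reset_index(drop=True)
print(doubled)
```

Each original row appears twice, back to back, and `reset_index(drop=True)` gives the result a clean 0..n index.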

Linear Regression It uses the relationship between the data points to draw a straight line through all of them. Example import matplotlib.pyplot as plt from scipy import stats x = [5,7,8,7,2,17,2,9,4,11,12,9,6] y = [99,86,87,88,111,86,103,87,94,78,77,85,86] slope, intercept, r, p, std_err = stats.linregress(x, y) def myfunc(x): return slope * x + intercept mymodel = list(map(myfunc, x)) plt.scatter..
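The excerpt's code, completed so it runs on its own (plotting omitted). The `r` returned by `linregress` is the correlation coefficient, which here comes out negative because y tends to fall as x rises:

```python
from scipy import stats

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

# Fit a straight line y = slope * x + intercept
slope, intercept, r, p, std_err = stats.linregress(x, y)

def myfunc(x):
    return slope * x + intercept

# Predicted y value for each observed x
mymodel = list(map(myfunc, x))
print(round(r, 2))  # correlation coefficient
```

A correlation coefficient near -1 or 1 means a line fits the data well; near 0 means it doesn't.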

Percentiles Percentiles are used in statistics to give you a number that describes the value that a given percent of the values are lower than. Example Use the NumPy percentile() method to find the percentiles. import numpy ages = [5,31,43,48,50,41,7,11,15,39,80,82,32,2,8,6,25,36,27,61,31] x = numpy.percentile(ages, 75) print(x) What is the 75th percentile? The answer is 43, meaning that 75 perc..
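The excerpt's example, completed so it runs as-is:

```python
import numpy

ages = [5,31,43,48,50,41,7,11,15,39,80,82,32,2,8,6,25,36,27,61,31]

# 75th percentile: 75% of the ages fall below this value
x = numpy.percentile(ages, 75)
print(x)  # 43.0
```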

Standard Deviation It is a number that describes how spread out the values are. A low standard deviation means that most of the numbers are close to the mean (average). A high standard deviation means that the values are spread out over a wider range. You can use std() to get the standard deviation. Example import numpy speed = [11, 20, 582, 12] x = numpy.std(speed) print(x) 245.83162428784462 Va..
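The excerpt's example as runnable code; the variance line is a companion addition, since the variance is simply the standard deviation squared:

```python
import numpy

speed = [11, 20, 582, 12]

x = numpy.std(speed)  # standard deviation
v = numpy.var(speed)  # variance (standard deviation squared)
print(x)
print(v)
```

The single outlier (582) is what drives the standard deviation so high here.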

ModeResult(mode=11, count=1) In machine learning, there are three values: Mean, the average value; Median, the midpoint value; and Mode, the most common value. Mean It is the average value: divide the sum by the number of values. Example (99+86+87+88+111+86+103+87+94+78+77+85+86) / 13 = 89.77 import numpy speed = [99,86,87,88,111,86,103,87,94,78,77,85,86] x = numpy.mean(speed) print(x) 89.7692307692..
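All three values can also be computed with Python's built-in statistics module, as an alternative to the NumPy/SciPy calls used in the post:

```python
import statistics

speed = [99,86,87,88,111,86,103,87,94,78,77,85,86]

print(statistics.mean(speed))    # average value
print(statistics.median(speed))  # midpoint of the sorted values: 87
print(statistics.mode(speed))    # most common value: 86 (appears 3 times)
```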

Stemming and Lemmatization both reduce words grammatically or semantically to their base forms. Stemming tends to extract root words, sometimes misspelled ones, by applying general or simplified rules when converting words to their base forms. Lemmatization, however, finds correctly spelled root words by considering grammatical elements such as..
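A toy illustration of the difference, not NLTK: a crude suffix-stripping "stemmer" next to a tiny hand-made lemma dictionary (the dictionary entries are hypothetical examples):

```python
def crude_stem(word):
    # Chop common suffixes blindly, which can produce misspelled stems
    for suffix in ("ies", "ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# Hypothetical lemma dictionary; a real lemmatizer consults a full lexicon
LEMMAS = {"studies": "study", "better": "good", "ran": "run"}

def lemmatize(word):
    return LEMMAS.get(word, word)

print(crude_stem("studies"))  # 'stud' -- a misspelled root
print(lemmatize("studies"))   # 'study' -- a correctly spelled root
```

Real stemmers (e.g. NLTK's PorterStemmer) and lemmatizers (e.g. WordNetLemmatizer) are far more sophisticated, but the contrast is the same: stemming chops, lemmatization looks up.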

Stop words are common words that are filtered out or removed from text data during the preprocessing phase. Common stop words include articles (e.g., "a," "an," "the"), prepositions (e.g., "in," "on," "at"), and conjunctions (e.g., "and," "but," "or"). Example code: 1) Install NLTK: pip install nltk 2) Import NLTK and download the Punkt tokenizer models (if not already downloaded): import nltk nltk.do..
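The filtering step itself can be shown without NLTK, using a small illustrative stopword set (NLTK's real English list, `nltk.corpus.stopwords.words('english')`, is much longer):

```python
# Hypothetical mini stopword list, just for illustration
STOP_WORDS = {"a", "an", "the", "in", "on", "at", "and", "but", "or"}

text = "The cat sat on the mat and looked at a bird"
tokens = text.lower().split()

# Keep only the tokens that are not stop words
filtered = [t for t in tokens if t not in STOP_WORDS]
print(filtered)  # ['cat', 'sat', 'mat', 'looked', 'bird']
```

Lowercasing first matters: otherwise "The" would slip past a lowercase-only stopword set.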

From the previous lesson, we have an idea of what Text Mining is. In this post, we are going to jump into Text Normalization. Text Normalization is categorized into five steps in total: Cleansing, Tokenization, Filtering/Removing stop words/Correcting spelling, Stemming, and Lemmatization. Today, we are going to dive into Text Tokenization. There are two types of tokenization, which ar..
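The two types are presumably sentence tokenization and word tokenization; here is a rough stdlib-only sketch of both (NLTK's `sent_tokenize` and `word_tokenize` handle many more edge cases):

```python
import re

text = "Text mining is fun. Tokenization splits text into pieces."

# Sentence tokenization: split after sentence-ending punctuation (naive)
sentences = re.split(r"(?<=[.!?])\s+", text)

# Word tokenization: pull out runs of word characters (naive)
words = re.findall(r"\w+", sentences[0])

print(sentences)  # two sentences
print(words)      # ['Text', 'mining', 'is', 'fun']
```

The naive split fails on abbreviations like "e.g.", which is exactly why trained tokenizers such as NLTK's Punkt models exist.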