List: Text mining (4)
Data Science Explorer

Stemming and Lemmatization both reduce grammatically or semantically changed words to their base form. Stemming applies general or more simplified rules when converting words to their base form, so it tends to extract roots that are not correctly spelled words. However, Lemmatization finds root words in correct spelling considering grammatical elements such as..
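The contrast described above can be illustrated with a toy sketch. This is not how a real stemmer or lemmatizer is implemented: the suffix list `SUFFIXES`, the mini-dictionary `LEMMAS`, and the functions `toy_stem` and `toy_lemmatize` are all hypothetical names made up for this illustration. It only shows why rule-based stripping can produce misspelled roots while a lookup that considers the part of speech returns a correctly spelled base form.

```python
# Toy illustration only: real stemmers and lemmatizers use far richer
# rules and dictionaries than the hypothetical ones below.

SUFFIXES = ("ies", "ing", "ed", "es", "s")  # hypothetical, simplified rule list


def toy_stem(word):
    """Strip the first matching suffix without checking the result is a real word."""
    for suffix in SUFFIXES:
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Hypothetical mini-dictionary mapping (word, part of speech) to its lemma.
LEMMAS = {
    ("studies", "verb"): "study",
    ("amusing", "verb"): "amuse",
    ("was", "verb"): "be",
}


def toy_lemmatize(word, pos):
    """Look up the correctly spelled base form; fall back to the word itself."""
    return LEMMAS.get((word, pos), word)


for word, pos in [("studies", "verb"), ("amusing", "verb"), ("was", "verb")]:
    print(word, "->", toy_stem(word), "|", toy_lemmatize(word, pos))
```

Here the toy stemmer turns "studies" into "stud" and "amusing" into "amus", neither of which is a correctly spelled word, while the dictionary lookup returns "study" and "amuse".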

Stop words are common words that are filtered out or removed from text data during the preprocessing phase. Common stop words include articles (e.g., "a," "an," "the"), prepositions (e.g., "in," "on," "at"), and conjunctions (e.g., "and," "but," "or").
Example code:
1) Install nltk: pip install nltk
2) Import NLTK and download the Punkt tokenizer models (if not already downloaded): import nltk nltk.do..
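The filtering idea can also be sketched without any library. This is a minimal sketch, not the post's NLTK-based code: the set `STOP_WORDS` is a hypothetical, hand-picked list built from the examples named above, and `remove_stop_words` is an illustrative helper name.

```python
import re

# Hypothetical, hand-picked stop word set taken from the examples above;
# libraries such as NLTK ship much longer lists.
STOP_WORDS = {"a", "an", "the", "in", "on", "at", "and", "but", "or"}


def remove_stop_words(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


filtered = remove_stop_words("The cat sat on the mat and looked at a bird.")
print(filtered)
```

Only the content-bearing words (`cat`, `sat`, `mat`, `looked`, `bird`) remain after filtering.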

Following the previous lesson, we have an idea of what Text Mining is. In this post, we are going to jump into Text Normalization. Text Normalization is divided into five steps in total: Cleansing, Tokenization, Filtering/Removing Stopwords/Correcting spelling, Stemming, and Lemmatization. Today, we are going to dive into Text Tokenization. There are two types of tokenization which ar..
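As a rough illustration of tokenization, the sketch below splits text at two granularities, sentences and words, using only the standard `re` module. The excerpt is cut off before it names its two types, so treating them as sentence-level and word-level tokenization is an assumption here, and `split_sentences` and `split_words` are hypothetical helper names. The regular expressions are deliberately naive and would mishandle abbreviations such as "e.g.".

```python
import re


def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    return re.split(r"(?<=[.!?])\s+", text.strip())


def split_words(sentence):
    """Return word tokens and punctuation marks as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


text = "Text mining is fun. Tokenization comes first!"
sentences = split_sentences(text)
words = [split_words(sentence) for sentence in sentences]

print(sentences)
print(words)
```

Splitting into sentences first and then into words keeps the sentence boundaries available for later steps, which is one possible design choice rather than a requirement.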

Text Mining performs analysis tasks such as business intelligence and predictive analysis by building models and extracting information using machine learning, language understanding, and statistics. It is divided into four parts: 1) Text Classification (Text Categorization): Text classification is like teaching a computer to understand and sort text into different categories or..