Pandas: Removing Duplicates

Data Science Explorer

Python

grace21110 2023. 10. 30. 20:34

To find duplicate rows in a dataset, use the duplicated() method. It returns a boolean Series: True for each row that repeats an earlier row, and False otherwise.

Example

print(df.duplicated())
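As a minimal sketch of what that output looks like, the snippet below builds a small DataFrame and inspects the result. The sample data and the name mask are assumptions made up for illustration; they are not from the original dataset.

```python
import pandas as pd

# Hypothetical sample data: the third row repeats the first row exactly.
df = pd.DataFrame({
    "StudentName": ["Ann", "Ben", "Ann", "Cara"],
    "Score": [90, 85, 90, 70],
})

# With the default keep="first", only the later copies are flagged.
mask = df.duplicated()
print(mask.tolist())    # [False, False, True, False]

# Summing the boolean Series counts the flagged rows.
print(int(mask.sum()))  # 1
```

Note that the first occurrence of a repeated row stays False, so the sum counts only the extra copies.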

Removing Duplicates

To remove duplicates, use the drop_duplicates() method.

Example

df.drop_duplicates(inplace=True)
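The call above modifies df directly. A short sketch of two common variations follows, assuming a made-up DataFrame: leaving out inplace=True so the original is preserved, and using the subset and keep parameters to control which rows count as duplicates and which copy survives.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "StudentName": ["Ann", "Ben", "Ann", "Ann"],
    "Score": [90, 85, 90, 60],
})

# Without inplace=True a new DataFrame is returned and df is left unchanged.
deduped = df.drop_duplicates()
print(len(df), len(deduped))      # 4 3

# subset limits the comparison to the listed columns;
# keep="last" retains the final occurrence of each name.
by_name = df.drop_duplicates(subset=["StudentName"], keep="last")
print(by_name["Score"].tolist())  # [85, 60]
```

Returning a new DataFrame is often the safer choice while exploring, because the original rows remain available for comparison.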

Exercises

Explain the following code.

# Count duplicate student names
duplicate_count = df[df.duplicated(subset=["StudentName"], keep=False)].shape[0]

# Display the count
print("Count of duplicate student names:", duplicate_count)

duplicated() is used to find the duplicate rows.
subset=["StudentName"]: only the listed columns, in this case "StudentName", are compared when checking for duplicates.
keep=False: every occurrence of a duplicated value is marked True, including the first one.
.shape[0]: returns the number of rows in the filtered DataFrame, that is, the number of rows whose student name appears more than once.
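To make the effect of keep=False concrete, here is a small sketch that compares three different counts on made-up data. The variable names all_rows, extra_rows, and distinct_names are assumptions introduced for this illustration.

```python
import pandas as pd

# Hypothetical sample data: "Ann" appears three times, "Ben" twice, "Cara" once.
df = pd.DataFrame({"StudentName": ["Ann", "Ben", "Ann", "Cara", "Ben", "Ann"]})

# keep=False flags every row whose name is repeated (what the exercise counts).
dup_mask = df.duplicated(subset=["StudentName"], keep=False)
all_rows = int(dup_mask.sum())
print(all_rows)        # 5

# The default keep="first" flags only the extra copies.
extra_rows = int(df.duplicated(subset=["StudentName"]).sum())
print(extra_rows)      # 3

# Number of distinct names that occur more than once.
distinct_names = df.loc[dup_mask, "StudentName"].nunique()
print(distinct_names)  # 2
```

Which of the three numbers is wanted depends on the question being asked: rows involved in duplication, rows that would be dropped, or names that repeat.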
