data-cleaning

Data cleaning means identifying and correcting errors in the data.

  • Handling missing data
    • Ignore the affected records if the missing values are insignificant
    • Fill with a global constant (such as “Unknown”, “N/A”, etc.)
    • Fill with mean or median
    • Fill with the most probable value, inferred from similar data points (using decision trees or Bayesian methods)
  • Smoothing noisy data
    • Binning
    • Regression
    • Clustering
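
A minimal sketch of a few of the techniques listed above, using only the Python standard library. It assumes missing values are represented as `None` and that binning means equal-frequency bins smoothed by their bin means; the function names and sample data are hypothetical, chosen only for illustration.

```python
from statistics import mean, median


def fill_constant(values, constant="Unknown"):
    """Replace missing entries (None) with a global constant."""
    return [constant if v is None else v for v in values]


def fill_central(values, strategy="mean"):
    """Replace missing entries (None) with the mean or median of the observed values."""
    if strategy not in ("mean", "median"):
        raise ValueError("strategy must be 'mean' or 'median'")
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to compute a fill value from")
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in values]


def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of bin_size items,
    and replace every value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        smoothed.extend([mean(bin_values)] * len(bin_values))
    return smoothed


# Hypothetical sample data
cities = ["Jakarta", None, "Bandung"]
ages = [25, None, 31, 40, None]
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]

print(fill_constant(cities))            # ['Jakarta', 'Unknown', 'Bandung']
print(fill_central(ages, "median"))     # [25, 31, 31, 40, 31]
print(smooth_by_bin_means(prices, 3))   # [9, 9, 9, 22, 22, 22, 29, 29, 29]
```

Note that smoothing by bin means returns the values in sorted order, so it suits cases where the original row order does not matter. Filling with the most probable value, or smoothing by regression or clustering, would need a fitted model and is not covered by this sketch.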

Status: #idea
Tags: data-mining, data-prepartion


References