data-cleaning

Data cleaning means identifying and correcting errors in the data.

  • Handling missing data
    • Ignore if insignificant
    • Fill with a global constant (such as “Unknown”, “N/A”, etc.)
    • Fill with mean or median
    • Fill with the most probable value, inferred from similar data points (using decision trees or Bayesian methods)
  • Smoothing noisy data
    • Binning
    • Regression
    • Clustering
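
A minimal sketch of two of the techniques above: filling missing values with a constant, mean, or median, and smoothing noisy data by bin means. The function names `fill_missing` and `smooth_by_bin_means` are hypothetical, and using `None` as the missing-value marker and equal-frequency bins are assumptions made for illustration.

```python
from statistics import mean, median


def fill_missing(values, strategy="mean", constant="Unknown"):
    """Replace None entries with a constant, the mean, or the median."""
    # Only the observed (non-missing) values contribute to the statistic.
    present = [v for v in values if v is not None]
    if strategy == "constant":
        fill = constant
    elif strategy == "mean":
        fill = mean(present)
    elif strategy == "median":
        fill = median(present)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of bin_size, and replace
    every value with the mean of its bin (equal-frequency binning)."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        current_bin = ordered[start:start + bin_size]
        smoothed.extend([mean(current_bin)] * len(current_bin))
    return smoothed


ages = [10, None, 20, 60]
print(fill_missing(ages, "mean"))      # [10, 30, 20, 60]
print(fill_missing(ages, "median"))    # [10, 20, 20, 60]
print(fill_missing(ages, "constant"))  # [10, 'Unknown', 20, 60]

prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
print(smooth_by_bin_means(prices, 3))  # [9, 9, 9, 22, 22, 22, 29, 29, 29]
```

The mean is pulled toward the outlier 60 while the median is not, which is one reason to prefer the median for skewed data. Note that smoothing by bin means returns the values in sorted order, not in their original positions.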

Status: #idea
Tags: data-mining, kdd, data-prepartion


References