Dataproofer

A proofreader for your data. Currently in beta.

Every day, more and more data is created. Journalists, analysts, and data visualizers turn that data into stories and insights.

But before you can make use of any data, you need to know if it’s reliable. Is it weird? Is it clean? Can I use it to write or make a viz?

This used to be a long manual process, using valuable time and introducing the possibility of human error. People can’t always spot every mistake every time, no matter how hard they try.

Dataproofer is built to automate this process of checking a dataset for errors or potential mistakes.

Getting Started (Desktop)

1. Download a .zip of the latest release from the Dataproofer releases page.
2. Drag the app into your applications folder.
3. Select your dataset, which can be either a CSV on your computer or a Google Sheet that you’ve published to the web.
4. Once you select your dataset, choose which suites and tests to run by turning them on or off.
5. Proof your data, get your results, and feel confident about your dataset.

Test Suites

Information & Diagnostics

A set of tests that infer descriptive information based on the contents of a table's cells.

- Check for numeric values in columns
- Check for strings in columns

Core Suite

A set of tests related to common problems and data checks, namely making sure data has not been truncated by looking for specific cut-off indicators.

- Check for duplicate rows
- Check for empty columns (no values)
- Check for special, non-typical Latin characters/letters in strings
- Check for big integer cut-offs as defined by MySQL and PostgreSQL, common database programs
- Check for integer cut-offs as defined by MySQL and PostgreSQL, common database programs
- Check for small integer cut-offs as defined by MySQL and PostgreSQL, common database programs
- Check whether there are exactly 65k rows, an indication that rows may have been lost when the data was exported from a database
- Check for strings that are exactly 255 characters, an indication that data may have been lost when it was exported from MySQL

Geo Suite

A set of tests related to common geographic data problems.

- Check for invalid latitude and longitude values (latitude outside the range of -90° to 90°, or longitude outside the range of -180° to 180°)
- Check for void latitude and longitude values (values at 0°,0°)

Stats Suite

A set of tests related to common statistical methods used to detect outlying data.

- Check for outliers within a column relative to the column's median
- Check for outliers within a column relative to the column's mean
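
To make a few of the checks listed above more concrete, here is a minimal Python sketch of how such tests could work. It is not Dataproofer's own code. The function names are hypothetical, and the median-based outlier rule (a scaled median absolute deviation with a cutoff of 3) is an assumption, since the list above does not say how far from the median a value has to be. The integer maxima are the documented limits of the MySQL and PostgreSQL SMALLINT, INT, and BIGINT types.

```python
from statistics import median

# Largest values that common integer column types can hold. The signed maxima
# are shared by MySQL and PostgreSQL; the unsigned maxima exist only in MySQL.
INTEGER_CUTOFFS = {
    32767: "signed SMALLINT max",
    65535: "unsigned SMALLINT max (MySQL)",
    2147483647: "signed INT max",
    4294967295: "unsigned INT max (MySQL)",
    9223372036854775807: "signed BIGINT max",
    18446744073709551615: "unsigned BIGINT max (MySQL)",
}


def find_integer_cutoffs(values):
    """Return (index, label) pairs for cells equal to a known integer maximum."""
    hits = []
    for i, value in enumerate(values):
        try:
            number = int(str(value).strip())
        except ValueError:
            continue  # not an integer cell, nothing to check
        if number in INTEGER_CUTOFFS:
            hits.append((i, INTEGER_CUTOFFS[number]))
    return hits


def find_255_char_strings(values):
    """Return indices of strings that are exactly 255 characters long."""
    return [i for i, v in enumerate(values) if isinstance(v, str) and len(v) == 255]


def find_invalid_coordinates(points):
    """Return indices of (lat, lon) pairs outside the valid coordinate ranges."""
    return [
        i for i, (lat, lon) in enumerate(points)
        if not (-90 <= lat <= 90 and -180 <= lon <= 180)
    ]


def find_void_coordinates(points):
    """Return indices of (lat, lon) pairs sitting exactly at 0, 0."""
    return [i for i, (lat, lon) in enumerate(points) if lat == 0 and lon == 0]


def find_median_outliers(values, threshold=3.0):
    """Return indices of values far from the median.

    Assumption: distance is measured in scaled median absolute deviations
    (MAD * 1.4826), and `threshold` is an illustrative default.
    """
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        return []  # no spread around the median, so no scale to judge outliers by
    return [
        i for i, v in enumerate(values)
        if abs(v - center) / (1.4826 * mad) > threshold
    ]
```

Each function returns the positions of suspicious cells rather than raising an error, since a hit only signals that the data deserves a closer look. A value of 2147483647 may be legitimate, but it is also what a database stores when a larger number is clamped to an INT column.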