CSV Data Aggregation

Difficulty: 2 (junior)
Tags: Data Processing, CSV, Functions

Process a CSV file containing sales data and calculate aggregated statistics. Learn data transformation, grouping operations, and statistical calculations.

Your Challenge

Core Functionality

Your workflow must:

  • Read data from a CSV file (minimum 50 rows with columns: date, product, category, quantity, price)
  • Parse CSV data into structured JSON objects
  • Group sales data by category
  • Calculate total revenue per category (quantity × price)
  • Calculate average order value per category (total revenue ÷ number of transactions)
  • Count number of transactions per category
  • Output results in a structured format (JSON or new CSV)
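
As a rough illustration of the steps above, the following plain JavaScript sketch parses CSV text, groups rows by category, and computes the three metrics. It assumes a simple CSV with no quoted fields or embedded delimiters. The helper names parseCsv and aggregateByCategory and the sample rows are made up for this example, and the logic would need adapting to whichever node the workflow runs it in.

```javascript
// Hypothetical helper: parse simple CSV text (no quoted fields) into objects.
function parseCsv(text, delimiter = ",") {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return [];
  const headers = lines[0].split(delimiter).map((h) => h.trim());
  return lines.slice(1).map((line) => {
    const cells = line.split(delimiter);
    const row = {};
    headers.forEach((h, i) => {
      row[h] = (cells[i] ?? "").trim();
    });
    return row;
  });
}

// Hypothetical helper: group rows by category and keep running totals.
function aggregateByCategory(rows) {
  const groups = new Map();
  for (const row of rows) {
    const revenue = Number(row.quantity) * Number(row.price);
    const group = groups.get(row.category) ?? { totalRevenue: 0, transactions: 0 };
    group.totalRevenue += revenue;
    group.transactions += 1;
    groups.set(row.category, group);
  }
  return [...groups].map(([category, group]) => ({
    category,
    totalRevenue: group.totalRevenue,
    transactions: group.transactions,
    averageOrderValue: group.totalRevenue / group.transactions,
  }));
}

// Made-up sample data using the required column names.
const sample = [
  "date,product,category,quantity,price",
  "2024-01-01,Pen,Office,2,1.5",
  "2024-01-02,Desk,Furniture,1,120",
  "2024-01-03,Stapler,Office,1,7",
].join("\n");

const summary = aggregateByCategory(parseCsv(sample));
console.log(JSON.stringify(summary, null, 2));
```

With this sample, the Office category ends up with a total revenue of 10 across 2 transactions (average order value 5), and Furniture with 120 across 1 transaction.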

Error Handling

Your workflow should:

  • Handle missing or invalid data in CSV rows (empty fields, non-numeric values)
  • Validate CSV file structure before processing
  • Provide meaningful error messages if the file is malformed
  • Skip rows with invalid data rather than crashing the workflow
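
One possible way to meet these requirements is to validate the header row up front and then sort rows into valid and skipped sets instead of throwing on the first bad value. The sketch below is only an illustration: the helper names (toNumber, validateHeaders, partitionRows), the error wording, and the sample rows are assumptions, not part of the challenge.

```javascript
const REQUIRED_COLUMNS = ["date", "product", "category", "quantity", "price"];

// Hypothetical helper: convert a field to a number, treating blanks as invalid.
// Number("") is 0 in JavaScript, so empty strings are checked explicitly.
function toNumber(value) {
  const text = String(value ?? "").trim();
  return text === "" ? NaN : Number(text);
}

// Hypothetical helper: throw a descriptive error if required columns are missing.
function validateHeaders(headers) {
  const missing = REQUIRED_COLUMNS.filter((column) => !headers.includes(column));
  if (missing.length > 0) {
    throw new Error(`Malformed CSV: missing column(s): ${missing.join(", ")}`);
  }
}

// Hypothetical helper: separate usable rows from rows that should be skipped.
function partitionRows(rows) {
  const valid = [];
  const skipped = [];
  rows.forEach((row, index) => {
    const quantity = toNumber(row.quantity);
    const price = toNumber(row.price);
    let reason = null;
    if (String(row.category ?? "").trim() === "") reason = "empty category";
    else if (!Number.isFinite(quantity)) reason = "invalid quantity";
    else if (!Number.isFinite(price)) reason = "invalid price";

    if (reason) {
      // +2 converts the 0-based data index to a 1-based file line after the header.
      skipped.push({ rowNumber: index + 2, reason });
    } else {
      valid.push({ ...row, quantity, price });
    }
  });
  return { valid, skipped };
}

// Made-up rows: one valid, three with different problems.
const result = partitionRows([
  { date: "2024-01-01", product: "Pen", category: "Office", quantity: "2", price: "1.5" },
  { date: "2024-01-02", product: "Ink", category: "Office", quantity: "abc", price: "7" },
  { date: "2024-01-03", product: "Cup", category: "", quantity: "1", price: "3" },
  { date: "2024-01-04", product: "Mug", category: "Kitchen", quantity: "1", price: "" },
]);
console.log(result.valid.length, result.skipped);
```

Keeping the skipped rows with a reason and a line number makes it possible to report what was dropped rather than discarding it silently.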

Edge Cases

Your workflow should:

  • Handle CSV files with different delimiters (comma vs semicolon)
  • Process CSV files with headers in different cases (lowercase/uppercase)
  • Deal with empty CSV files appropriately
  • Handle very large CSV files without memory issues
  • Process negative quantities or prices correctly (for example, decide whether they represent returns or invalid data)
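
The delimiter, header-case, and empty-file cases can be handled by inspecting the first non-blank line before parsing. The following is a minimal sketch under the assumption that only commas and semicolons occur as delimiters and that header names contain neither. The helper names detectDelimiter, normalizeHeaders, and inspectCsv are hypothetical.

```javascript
// Hypothetical helper: guess the delimiter by counting candidates in the header line.
function detectDelimiter(headerLine) {
  const commas = headerLine.split(",").length - 1;
  const semicolons = headerLine.split(";").length - 1;
  return semicolons > commas ? ";" : ",";
}

// Hypothetical helper: trim and lowercase header names so "Price" and "PRICE" match "price".
function normalizeHeaders(headerLine, delimiter) {
  return headerLine.split(delimiter).map((header) => header.trim().toLowerCase());
}

// Hypothetical helper: return null for an empty file, otherwise delimiter and headers.
function inspectCsv(text) {
  const firstLine = text.split(/\r?\n/).find((line) => line.trim() !== "");
  if (firstLine === undefined) return null;
  const delimiter = detectDelimiter(firstLine);
  return { delimiter, headers: normalizeHeaders(firstLine, delimiter) };
}

console.log(inspectCsv("Date;Product;CATEGORY;Quantity;Price\n2024-01-01;Pen;Office;2;1.5"));
console.log(inspectCsv("")); // null signals an empty file to the caller
```

This sketch does not address very large files. One option there is to process rows in batches and keep only running per-category totals in memory, though the right approach depends on how the workflow loads the file.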

Bonus Challenges (Optional)

Take it further:

  • Add time-based grouping (daily, weekly, monthly aggregations)
  • Calculate additional metrics: median, standard deviation, min/max
  • Generate a summary report with top-performing categories
  • Create an output format that is ready for charting
  • Support multiple CSV files and merge results
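
For the additional metrics and time-based grouping, a sketch along these lines may be a useful starting point. It computes the population standard deviation (dividing by n), which is one choice among several. The monthKey helper assumes dates are in YYYY-MM-DD form, which the challenge does not specify. Both function names are hypothetical.

```javascript
// Hypothetical helper: descriptive statistics for a non-empty array of numbers.
function describe(values) {
  const sorted = [...values].sort((a, b) => a - b); // numeric sort, not lexicographic
  const n = sorted.length;
  const mean = sorted.reduce((sum, v) => sum + v, 0) / n;
  const mid = Math.floor(n / 2);
  const median = n % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  // Population variance: average squared distance from the mean.
  const variance = sorted.reduce((sum, v) => sum + (v - mean) ** 2, 0) / n;
  return {
    min: sorted[0],
    max: sorted[n - 1],
    mean,
    median,
    stdDev: Math.sqrt(variance),
  };
}

// Hypothetical helper: "2024-03-15" -> "2024-03", assuming YYYY-MM-DD dates.
function monthKey(isoDate) {
  return isoDate.slice(0, 7);
}

const stats = describe([2, 4, 4, 4, 5, 5, 7, 9]);
console.log(stats);
console.log(monthKey("2024-03-15"));
```

For the made-up input above, the mean is 5, the median is 4.5, and the standard deviation is 2. Grouping by the month key instead of the category would give monthly aggregations with the same grouping logic.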

Tips & Hints

  • Use the Read Binary File node to load CSV data
  • The Function node allows JavaScript for complex calculations
  • Consider using the Aggregate node for grouping operations
  • The Split Out node can help separate grouped data
  • Test with a small sample CSV before processing large files