All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog,
and this project adheres to Semantic Versioning.
State Jurisdiction Parsing: the stateJurisdiction field is now automatically parsed into the following separate fields, which makes the data easier to organize and filter:
state : State name extracted from stateJurisdiction
division : Division extracted from stateJurisdiction
zone : Zone extracted from stateJurisdiction
circle : Circle/office name extracted from stateJurisdiction (without "Jurisdictional Office" suffix)
Added new fields to dataset schema with proper descriptions
Updated business_details view in dataset schema to include the new jurisdiction fields
Added state filter option in business_details view
Added missing fields to dataset schema: adhrVdt (Aadhaar validation date) and district
Enhanced transformTaxpayerDetailsKeys function in scraper to parse jurisdiction string
Updated README.md to document the new jurisdiction fields
Original stateJurisdiction field is retained for reference alongside the new parsed fields
Fixed dataset schema validation: array schemas now accept variable-length arrays instead of fixed-length tuples
Updated schema to allow null values for optional fields
Parser handles jurisdiction strings in format: "State - [NAME],Division - [NAME],Zone - [NAME],Circle - [NAME] (Jurisdictional Office)"
Fields are set to null if the corresponding component is not found in the jurisdiction string
Backward compatible: existing functionality remains unchanged
Schema now supports variable-length arrays for businessActivities, finanacialYears, filingStatus, filingFrequency, and goodservice.services
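The jurisdiction parsing rules above can be illustrated with a short sketch. This is not the scraper's actual transformTaxpayerDetailsKeys code; the function name parse_state_jurisdiction, the sample strings, and the choice of Python are assumptions made only for illustration. It assumes that component names do not themselves contain commas.

```python
import re

# Suffix that is stripped from the circle/office name.
SUFFIX = re.compile(r"\s*\(Jurisdictional Office\)\s*$")
LABELS = ("state", "division", "zone", "circle")


def parse_state_jurisdiction(raw):
    """Split a stateJurisdiction string into state, division, zone and circle.

    Hypothetical helper: any component that is not present stays None.
    """
    parsed = {label: None for label in LABELS}
    if not raw:
        return parsed
    for part in raw.split(","):
        # Each component looks like "Label - NAME"; split on the first " - ".
        label, sep, name = part.partition(" - ")
        key = label.strip().lower()
        if sep and key in parsed:
            name = SUFFIX.sub("", name).strip()
            parsed[key] = name or None
    return parsed


# Made-up sample input in the documented format.
sample = (
    "State - Karnataka,Division - Division A,"
    "Zone - Zone 1,Circle - Circle 12 (Jurisdictional Office)"
)
result = parse_state_jurisdiction(sample)
partial = parse_state_jurisdiction("State - Karnataka,Zone - Zone 1")
```

With the sample above, result["circle"] is "Circle 12" (suffix removed), and partial["division"] and partial["circle"] are None because those components are missing from the input.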
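The difference between a fixed-length tuple schema and a variable-length array schema can be sketched as follows. The fragments are hypothetical JSON Schema style definitions written as Python dicts; the item types are assumptions and are not copied from the project's dataset schema. The small matches function covers only the keywords used here and stands in for a real schema validator.

```python
# Tuple form: one schema per position, length pinned to exactly two items.
tuple_style = {
    "type": "array",
    "items": [{"type": "string"}, {"type": "string"}],
    "minItems": 2,
    "maxItems": 2,
}

# Flexible form: a single schema applied to every element, any length,
# with null allowed for optional values.
flexible_style = {
    "type": "array",
    "items": {"type": ["string", "null"]},
}


def type_ok(expected, value):
    """Check a value against a "type" keyword (a name or a list of names)."""
    names = expected if isinstance(expected, list) else [expected]
    checks = {
        "string": lambda v: isinstance(v, str),
        "null": lambda v: v is None,
        "array": lambda v: isinstance(v, list),
    }
    return any(checks[name](value) for name in names)


def matches(schema, value):
    """Minimal validator for the two fragments above (illustration only)."""
    if not type_ok(schema["type"], value):
        return False
    if schema["type"] != "array":
        return True
    if len(value) < schema.get("minItems", 0):
        return False
    if "maxItems" in schema and len(value) > schema["maxItems"]:
        return False
    items = schema.get("items")
    if items is None:
        return True
    if isinstance(items, list):
        # Tuple form: schemas are matched to values by position.
        return all(matches(s, v) for s, v in zip(items, value))
    return all(matches(items, v) for v in value)
```

Under these assumptions, a three-element list such as ["a", None, "c"] is rejected by tuple_style but accepted by flexible_style, which is the behavior the schema fix describes.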
[1.0.0] - Initial Release
Initial release of GSTIN Scraper
Comprehensive GST data extraction from GST portal
Taxpayer details extraction (legal name, trade name, registration details, etc.)
HSN codes and services information extraction with SAC codes
Filing status and history for all financial years
Financial years data and filing frequency preferences
Flexible data extraction with extractHsnCodes and extractFilingDetails parameters
Usage-based billing system with transparent per-item charging
Captcha handling with retry logic
Error handling and validation for GSTIN format
Dataset views for overview, business details, goods & services, and filing details
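The GSTIN format validation listed above is not described in detail in this changelog. A minimal sketch, assuming the commonly documented 15-character layout (2-digit state code, 10-character PAN, entity code, the letter Z, and a check character), might look like the following. The function name is_valid_gstin_format is hypothetical, and the check character is not verified here.

```python
import re

# 2 digits (state code) + PAN (5 letters, 4 digits, 1 letter)
# + entity code + "Z" + check character.
GSTIN_PATTERN = re.compile(r"[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]")


def is_valid_gstin_format(value):
    """Return True if value has the shape of a GSTIN (structure only)."""
    if not isinstance(value, str):
        return False
    return GSTIN_PATTERN.fullmatch(value.strip().upper()) is not None
```

For example, the structurally well-formed sample string "27AAPFU0939F1ZV" passes this check, while a 14-character string or one that does not start with two digits fails. A structural match does not mean that the GSTIN is actually registered.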