Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog,
and this project adheres to Semantic Versioning.
[Unreleased]
Added
- State Jurisdiction Parsing: The stateJurisdiction field is now automatically parsed into separate fields for better data organization and filtering:
  - state: State name extracted from stateJurisdiction
  - division: Division extracted from stateJurisdiction
  - zone: Zone extracted from stateJurisdiction
  - circle: Circle/office name extracted from stateJurisdiction (without the "Jurisdictional Office" suffix)
- Added new fields to dataset schema with proper descriptions
- Updated business_details view in dataset schema to include the new jurisdiction fields
- Added state filter option in business_details view
- Added missing fields to dataset schema: adhrVdt (Aadhaar validation date) and district
Changed
- Enhanced the transformTaxpayerDetailsKeys function in the scraper to parse the jurisdiction string
- Updated README.md to document the new jurisdiction fields
- The original stateJurisdiction field is retained for reference alongside the new parsed fields
- Fixed dataset schema validation: Made array schemas flexible to allow variable-length arrays instead of fixed-length tuples
- Updated schema to allow null values for optional fields
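
The two schema changes above can be illustrated with a rough sketch. It assumes the dataset schema follows draft-07 style JSON Schema conventions; the constant names and the string item type are hypothetical and are not taken from the project's schema file.

```typescript
// Hypothetical JSON Schema fragments; the project's real schema may differ.

// Before: positional ("tuple") form that accepts exactly two items.
const fixedLengthTuple = {
  type: "array",
  items: [{ type: "string" }, { type: "string" }],
  minItems: 2,
  maxItems: 2,
};

// After: one item schema applies to every element, so any length validates.
const variableLengthArray = {
  type: "array",
  items: { type: "string" },
};

// Optional field: the value may be a string or null.
const nullableField = {
  type: ["string", "null"],
};
```

In the variable-length form, `items` is a single schema object rather than a list of per-position schemas, which is what removes the length constraint.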
Technical Details
- Parser handles jurisdiction strings in format: "State - [NAME],Division - [NAME],Zone - [NAME],Circle - [NAME] (Jurisdictional Office)"
- Fields are set to null if the corresponding component is not found in the jurisdiction string
- Backward compatible: existing functionality remains unchanged
- Schema now supports variable-length arrays for businessActivities, finanacialYears, filingStatus, filingFrequency, and goodservice.services
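
The parsing behavior described above could look roughly like the following. This is a minimal sketch, not the scraper's actual code: the function name `parseStateJurisdiction`, the `ParsedJurisdiction` type, and the sample values are all hypothetical, and it assumes component names never contain commas.

```typescript
// Hypothetical sketch of the jurisdiction parser; not the project's implementation.
interface ParsedJurisdiction {
  state: string | null;
  division: string | null;
  zone: string | null;
  circle: string | null;
}

function parseStateJurisdiction(raw: string | null | undefined): ParsedJurisdiction {
  // Every field defaults to null so missing components stay null.
  const result: ParsedJurisdiction = { state: null, division: null, zone: null, circle: null };
  if (!raw) return result;

  // Components are comma-separated; each one looks like "Label - Value".
  for (const part of raw.split(",")) {
    const match = part.match(/^\s*(State|Division|Zone|Circle)\s*-\s*(.+?)\s*$/i);
    if (!match) continue;

    const key = match[1].toLowerCase() as keyof ParsedJurisdiction;
    let value = match[2].trim();
    if (key === "circle") {
      // Drop the trailing "(Jurisdictional Office)" suffix from the circle name.
      value = value.replace(/\s*\(Jurisdictional Office\)\s*$/i, "").trim();
    }
    result[key] = value.length > 0 ? value : null;
  }
  return result;
}

// Made-up sample input in the documented format.
const parsed = parseStateJurisdiction(
  "State - Exampleland,Division - Division A,Zone - Zone 1,Circle - Circle 9 (Jurisdictional Office)"
);
console.log(parsed);
```

Because the result object starts with every field set to null, a string that omits a component (for example, one with only State and Zone) leaves division and circle as null without any special handling.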
[1.0.0] - Initial Release
Added
- Initial release of GSTIN Scraper
- Comprehensive GST data extraction from GST portal
- Taxpayer details extraction (legal name, trade name, registration details, etc.)
- HSN codes and services information extraction with SAC codes
- Filing status and history for all financial years
- Financial years data and filing frequency preferences
- Flexible data extraction with the extractHsnCodes and extractFilingDetails parameters
- Usage-based billing system with transparent per-item charging
- Captcha handling with retry logic
- Error handling and validation for GSTIN format
- Dataset views for overview, business details, goods & services, and filing details