Why Indonesian Address Validation Is So Damn Hard
Address validation seems straightforward until you try implementing it for Indonesia. Users enter their addresses in wildly different formats. The same street has multiple spelling variations. Postal codes exist but many people don’t know theirs. Buildings have informal names that shipping providers use but don’t appear in official databases. Good luck making sense of it all.
After building address validation for three Indonesian e-commerce platforms, I’ve learned what’s possible, what isn’t, and what workarounds actually help. Here’s the reality.
Why Postal Codes Aren’t Enough
In countries like the US, a postal code alone identifies a fairly small geographic area. In Indonesia, one postal code might cover an entire kelurahan with thousands of buildings across multiple streets. The postal code narrows the location somewhat but doesn’t uniquely identify an address.
This means you need the full hierarchical breakdown—province, kabupaten/kota, kecamatan, kelurahan, postal code—plus street and building information. Missing any piece makes validation unreliable.
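To make the hierarchy concrete, here is a minimal sketch of a record that holds each level and reports which ones are missing. The class name, field names, and the `missing_levels` helper are my own illustrations, not part of any official schema.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class AddressHierarchy:
    # Field names are illustrative, not taken from any official schema.
    province: Optional[str] = None
    kabupaten_kota: Optional[str] = None
    kecamatan: Optional[str] = None
    kelurahan: Optional[str] = None
    postal_code: Optional[str] = None
    street: Optional[str] = None


def missing_levels(address: AddressHierarchy) -> List[str]:
    """Return the names of fields that are empty or whitespace-only."""
    return [
        f.name
        for f in fields(address)
        if not (getattr(address, f.name) or "").strip()
    ]


address = AddressHierarchy(
    province="DKI Jakarta",
    kabupaten_kota="Jakarta Selatan",
    kecamatan="Kebayoran Baru",
    kelurahan="Senayan",
    street="Jl. Sudirman No. 123",
)
print(missing_levels(address))  # ['postal_code']
```

A list of missing levels is more useful than a plain pass/fail because the checkout form can ask for exactly the piece that is absent.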
Worse, many Indonesians don’t know their postal code. They know their address by landmarks and neighborhood names but haven’t memorized a five-digit code they rarely use. Requiring postal code entry creates friction when users have to look it up separately.
Address Format Variations
There’s no standard Indonesian address format. Some people write street first, others write neighborhood first. Some include RT/RW (neighborhood administrative divisions), others don’t. Some spell out “Jalan,” others abbreviate to “Jl.”
Real examples from our database:
- “Jl. Sudirman No. 123, RT 01/RW 05, Kelurahan Senayan, Kecamatan Kebayoran Baru”
- “Komplek Perumahan Graha Indah Blok B No. 12, Kelapa Gading”
- “Gang Mawar 3, dekat warung Pak Budi”
All of these need to be parsed into structured data for validation and shipping. Free-form text boxes work better than rigid structured forms, but then you’re parsing natural language with huge variation.
The RT/RW Problem
RT (Rukun Tetangga) and RW (Rukun Warga) are neighborhood administrative units smaller than a kelurahan. They’re important for local identification but not standardized in databases. Postal codes don’t break down by RT/RW.
Some shipping providers use RT/RW for routing. Others ignore them. Users often include RT/RW because that’s how they think about their address. Your system needs to capture it but can’t validate it against any official database because comprehensive RT/RW databases don’t really exist.
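Capturing RT/RW without validating it can be as simple as a tolerant regular expression. The sketch below assumes the common "RT nn/RW nn" shape with optional periods and a slash, comma, or space as separator. The pattern and the `extract_rt_rw` name are my own, and real data will contain variants this misses.

```python
import re
from typing import Optional, Tuple

# Assumed shape: "RT 01/RW 05", "RT.01 RW.05", "rt 1, rw 5", and similar.
RT_RW_PATTERN = re.compile(
    r"\bRT\.?\s*(\d{1,3})\s*[/,\s]\s*RW\.?\s*(\d{1,3})\b",
    re.IGNORECASE,
)


def extract_rt_rw(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (rt, rw) exactly as typed, or (None, None) if not found."""
    match = RT_RW_PATTERN.search(text)
    if not match:
        return None, None
    return match.group(1), match.group(2)


print(extract_rt_rw("Jl. Sudirman No. 123, RT 01/RW 05, Kelurahan Senayan"))
# ('01', '05')
```

The values are kept as strings and never checked against a reference table, which matches the "capture but don't validate" approach.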
Informal Place Names
“Dekat warung Pak Budi” (near Pak Budi’s shop) appears in addresses. So do “seberang alfamart” (across from Alfamart) and “belakang masjid” (behind the mosque). These landmarks are more meaningful to local delivery drivers than official street names.
Your validation system can’t recognize these, but they’re crucial for deliverability. Too-strict validation that strips out landmark descriptions makes addresses less useful despite appearing more “correct” in your database.
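One way to keep landmark text instead of stripping it is to split it off into its own field. This is a rough sketch: the cue-word list extends the three examples above with a few other common Indonesian prepositions, and both the list and the `split_landmark` helper are assumptions you would tune against your own data.

```python
import re
from typing import Tuple

# Cue words: near, across from, behind, in front of, beside, next to.
LANDMARK_CUES = ("dekat", "seberang", "belakang", "depan", "samping", "sebelah")
_CUE_PATTERN = re.compile(
    r"\b(?:" + "|".join(LANDMARK_CUES) + r")\b.*", re.IGNORECASE
)


def split_landmark(text: str) -> Tuple[str, str]:
    """Split text into (address_part, landmark_note) at the first cue word."""
    match = _CUE_PATTERN.search(text)
    if not match:
        return text.strip(), ""
    address_part = text[: match.start()].rstrip(" ,;-")
    return address_part, match.group(0).strip()


print(split_landmark("Gang Mawar 3, dekat warung Pak Budi"))
# ('Gang Mawar 3', 'dekat warung Pak Budi')
```

The landmark note is passed through to the courier untouched, and only the address part goes to validation.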
Street Name Variations
The same street might be written as “Jalan Sudirman,” “Jl. Sudirman,” “Jl Sudirman” (no period), or just “Sudirman.” Validation needs to recognize these as equivalent. But simple string matching fails because “Jalan Kebon Sirih” and “Kebon Sirih” could be different locations (one the street, one the neighborhood).
Abbreviations and punctuation add confusion. Is it “Kelapa Gading” or “Klp. Gading”? Your validation might know one spelling while users enter another. Any mapping of variations you build will never be complete.
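The prefix variants can be collapsed with a small normalizer. This sketch only rewrites an explicit "Jalan", "Jln", or "Jl" prefix and deliberately leaves a bare name such as "Sudirman" alone, since it may refer to a neighborhood rather than a street. The function name and the prefix list are assumptions.

```python
import re

# Matches "Jalan", "Jln", or "Jl" at the start, with or without a period.
_STREET_PREFIX = re.compile(r"^\s*(?:jalan|jln|jl)\b\.?\s*", re.IGNORECASE)


def normalize_street(raw: str) -> str:
    """Rewrite a leading street prefix to 'Jalan'; leave other text as-is."""
    collapsed = re.sub(r"\s+", " ", raw).strip()
    match = _STREET_PREFIX.match(collapsed)
    if not match:
        return collapsed  # no explicit prefix: do not guess
    return "Jalan " + collapsed[match.end():]


for variant in ("Jalan Sudirman", "Jl. Sudirman", "Jl Sudirman", "Sudirman"):
    print(normalize_street(variant))
# Jalan Sudirman
# Jalan Sudirman
# Jalan Sudirman
# Sudirman
```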
Building Names and Numbers
Official building numbers often don’t match what’s painted on buildings. A house might officially be Jl. Sudirman No. 123 but everyone calls it “Rumah Pak Hadi” (Mr. Hadi’s house). Delivery drivers know it by the informal name.
Apartment buildings present special challenges. “Apartemen Taman Rasuna Tower 18, Unit 2506” needs to be parsed into building name, tower or block, and unit number. But users enter this in countless formats. Validation that rejects addresses because they don’t fit expected patterns creates support headaches.
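A tolerant parser extracts what it can and never rejects the input. The sketch below looks for tower and unit markers in any order and treats whatever precedes them as the building name. The keyword lists and the `parse_apartment` helper are illustrative assumptions.

```python
import re
from typing import Dict, Optional

_TOWER = re.compile(r"\b(?:tower|twr|blok)\.?\s*([A-Za-z0-9]+)", re.IGNORECASE)
_UNIT = re.compile(r"\bunit\.?\s*([A-Za-z0-9-]+)", re.IGNORECASE)


def parse_apartment(text: str) -> Dict[str, Optional[str]]:
    """Best-effort split into building, tower, and unit; always keeps raw."""
    tower = _TOWER.search(text)
    unit = _UNIT.search(text)
    starts = [m.start() for m in (tower, unit) if m]
    # Everything before the first marker is treated as the building name.
    building = text[: min(starts)].strip(" ,") if starts else text.strip()
    return {
        "building": building or None,
        "tower": tower.group(1) if tower else None,
        "unit": unit.group(1) if unit else None,
        "raw": text,
    }


parsed = parse_apartment("Apartemen Taman Rasuna Tower 18, Unit 2506")
print(parsed["building"], parsed["tower"], parsed["unit"])
# Apartemen Taman Rasuna 18 2506
```

Fields that cannot be found come back as `None` rather than raising, so an unusual format degrades to free text instead of a failed checkout.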
Geographic Data Quality
The quality of Google Maps location data for Indonesia varies by area. Major cities are generally accurate. Remote areas can be off by hundreds of meters or more. Addresses in new developments might not appear in mapping databases yet.
Coordinate validation—checking if a pin drop matches the entered address—works in some contexts but fails where map data is incomplete. Users get frustrated when the system insists they’re in the wrong location based on inaccurate maps.
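If you do compare a pin drop with a geocoded point, a soft check is safer than a hard rejection. The sketch below uses the haversine great-circle distance and returns a status instead of an error. The 500 m tolerance and the status strings are placeholders I chose, not recommendations.

```python
import math
from typing import Optional, Tuple

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in meters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def pin_check(pin: Point, geocoded: Optional[Point],
              tolerance_m: float = 500.0) -> str:
    """Return 'ok', 'confirm_with_user', or 'unverified'; never reject."""
    if geocoded is None:
        return "unverified"  # map data missing: do not block the user
    distance = haversine_m(pin[0], pin[1], geocoded[0], geocoded[1])
    return "ok" if distance <= tolerance_m else "confirm_with_user"


print(pin_check((-6.2, 106.8), (-6.201, 106.8)))  # ok
print(pin_check((-6.2, 106.8), None))             # unverified
```

In areas where map data is known to be poor, you could widen the tolerance or skip the check entirely.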
Language and Spelling Issues
Indonesian spelling isn’t perfectly standardized for place names. Regional languages leave their mark: Javanese, Sundanese, and Balinese names may be spelled in several different ways. Validation that assumes standard Indonesian spelling rules fails for these addresses.
Older, Dutch-era spellings create further variation. Someone might write “Jakarta” or “Djakarta.” Both refer to the same place, but exact string matching treats them as different. Historical spellings persist even after official names change.
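Spellings like “Djakarta” follow the orthography used before the 1972 spelling reform, which is regular enough to convert mechanically. The sketch below maps a few old letter combinations to their modern forms and uses the result only as a fallback, because applying it to a modern name would corrupt it. The mapping is partial and the function names are my own.

```python
import re
from typing import Iterable, Optional

# Partial old-to-modern mapping; longer patterns are listed before plain "j".
_OLD_TO_NEW = {"dj": "j", "tj": "c", "nj": "ny", "sj": "sy", "oe": "u", "j": "y"}
_OLD_PATTERN = re.compile("dj|tj|nj|sj|oe|j")


def modernize_spelling(name: str) -> str:
    """Convert old-orthography letter groups in a single left-to-right pass."""
    return _OLD_PATTERN.sub(lambda m: _OLD_TO_NEW[m.group()], name.casefold())


def resolve_place(name: str, known: Iterable[str]) -> Optional[str]:
    """Try the name as typed first, then its modernized form."""
    lookup = {k.casefold(): k for k in known}
    cleaned = name.strip()
    for candidate in (cleaned.casefold(), modernize_spelling(cleaned)):
        if candidate in lookup:
            return lookup[candidate]
    return None


places = ["Jakarta", "Surabaya", "Bandung"]
print(resolve_place("Djakarta", places))   # Jakarta
print(resolve_place("Soerabaja", places))  # Surabaya
print(resolve_place("Jakarta", places))    # Jakarta
```

Trying the typed form first matters: `modernize_spelling("Jakarta")` returns "yakarta", so the conversion must never run on names that already match.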
What Actually Works
Structured autocomplete helps—as users type, suggest matches from your database. This guides them toward validated addresses while allowing flexibility for novel entries. The suggestions improve data quality without being dictatorial about format.
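The core of such an autocomplete is a ranked lookup: prefix matches first, then matches anywhere in the name. Here is a minimal in-memory sketch. A production system would more likely query a search index, and the `suggest` function and sample list are illustrative.

```python
from typing import List, Sequence


def suggest(typed: str, places: Sequence[str], limit: int = 5) -> List[str]:
    """Return up to `limit` suggestions, prefix matches before substrings."""
    needle = typed.casefold().strip()
    if not needle:
        return []
    starts = [p for p in places if p.casefold().startswith(needle)]
    contains = [
        p for p in places
        if needle in p.casefold() and p not in starts
    ]
    return (starts + contains)[:limit]


districts = ["Kelapa Gading", "Kebayoran Baru", "Kebayoran Lama", "Senayan"]
print(suggest("keb", districts))  # ['Kebayoran Baru', 'Kebayoran Lama']
print(suggest("gad", districts))  # ['Kelapa Gading']
```

An empty result should leave the field editable, so that users with an address your database has never seen can still type it in.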
Allow free-form detail fields for landmarks and navigation instructions. Have validated structure fields (province, city, district, postal code) plus an “additional details” text field where users enter whatever helps deliveries arrive. This balances structure with reality.
In the Indonesian context, phone number validation matters more than address validation. If the address is ambiguous, delivery drivers call to ask for directions, so a working phone number often does more for successful delivery than a perfect address.
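A basic format check catches obvious typos before an order ships. The sketch below normalizes Indonesian mobile numbers to the +62 international form, relying on the facts that the country code is 62, the domestic trunk prefix is 0, and mobile numbers start with 8. The accepted length window is my assumption, and a format check cannot tell you whether the number is actually in service.

```python
import re
from typing import Optional


def normalize_mobile(raw: str) -> Optional[str]:
    """Return the number as +62..., or None if it does not look like a mobile."""
    digits = re.sub(r"[^\d+]", "", raw)  # drop spaces, dashes, parentheses
    if digits.startswith("+62"):
        national = digits[3:]
    elif digits.startswith("62"):
        national = digits[2:]
    elif digits.startswith("0"):
        national = digits[1:]
    else:
        return None
    # Assumption: a leading 8 followed by 8 to 11 more digits.
    if re.fullmatch(r"8\d{8,11}", national):
        return "+62" + national
    return None


print(normalize_mobile("0812-3456-7890"))     # +6281234567890
print(normalize_mobile("+62 812 3456 7890"))  # +6281234567890
print(normalize_mobile("12345"))              # None
```

This version rejects landlines on purpose, so if you accept office numbers you would need a separate branch for area codes.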
Shipping Provider Integration
Each shipping provider has its own location database and addressing requirements. What JNE accepts might differ from what SiCepat accepts. Cross-referencing multiple providers’ databases catches issues your own validation misses.
Some platforms send the same address to multiple shipping providers and let them compete for the order. The providers’ own validation and route optimization determine who can deliver. This outsources validation to specialists rather than trying to solve everything internally.
Learning From Delivery Failures
Track which addresses cause delivery failures. These reveal patterns—certain neighborhoods where addresses are systematically wrong, formatting issues that confuse drivers, or postal codes that don’t match actual locations. This feedback loop improves validation over time.
When delivery fails, capture the driver’s notes. “Rumah tidak ada” (house doesn’t exist) versus “nomor salah” (wrong number) versus “penerima tidak dapat dihubungi” (recipient unreachable) indicate different problem types. This data trains better validation.
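Free-text driver notes can be bucketed with plain keyword matching before you invest in anything smarter. The sketch below uses the three phrases above. The category names and the `classify_note` helper are assumptions, and real notes will need a longer phrase list.

```python
from collections import Counter
from enum import Enum
from typing import Iterable


class FailureReason(Enum):
    ADDRESS_NOT_FOUND = "address_not_found"
    WRONG_NUMBER = "wrong_number"
    RECIPIENT_UNREACHABLE = "recipient_unreachable"
    OTHER = "other"


# Phrase -> category, checked in order; extend with phrases from your logs.
_PHRASES = [
    ("rumah tidak ada", FailureReason.ADDRESS_NOT_FOUND),
    ("nomor salah", FailureReason.WRONG_NUMBER),
    ("tidak dapat dihubungi", FailureReason.RECIPIENT_UNREACHABLE),
]


def classify_note(note: str) -> FailureReason:
    """Map a driver's note to a coarse failure category."""
    text = note.casefold()
    for phrase, reason in _PHRASES:
        if phrase in text:
            return reason
    return FailureReason.OTHER


def failure_counts(notes: Iterable[str]) -> Counter:
    """Count how often each failure category occurs."""
    return Counter(classify_note(n) for n in notes)


notes = ["Rumah tidak ada", "Penerima tidak dapat dihubungi", "nomor salah"]
print(failure_counts(notes)[FailureReason.WRONG_NUMBER])  # 1
```

Grouping these counts by kelurahan or postal code is what surfaces the systematic patterns, for example one neighborhood with an unusually high share of address-not-found failures.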
User Education
Sometimes the problem is user error—they enter their neighbor’s address by mistake or leave out crucial details. Clear labeling and examples help. “Include landmarks like nearby shops or buildings” tells users what to add.
Validation messages should be helpful, not just error notices. Instead of “Invalid address,” try “We couldn’t find this kelurahan in Tangerang. Did you mean [suggestion]?” Guide users toward correct entry rather than just rejecting wrong entry.
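A "did you mean" message can be built with the standard library's `difflib.get_close_matches`, scoped to the city the user already chose. In this sketch the sample kelurahan list, the 0.7 cutoff, and the message wording are all illustrative.

```python
from difflib import get_close_matches
from typing import Dict, List, Optional


def kelurahan_message(entered: str, city: str,
                      kelurahan_by_city: Dict[str, List[str]]) -> Optional[str]:
    """Return None if the entry is known, otherwise a helpful message."""
    lookup = {k.casefold(): k for k in kelurahan_by_city.get(city, [])}
    key = entered.casefold().strip()
    if key in lookup:
        return None
    hits = get_close_matches(key, list(lookup), n=1, cutoff=0.7)
    if hits:
        return (f"We couldn't find {entered} in {city}. "
                f"Did you mean {lookup[hits[0]]}?")
    return f"We couldn't find {entered} in {city}. Please check the spelling."


sample = {"Tangerang": ["Cikokol", "Sukasari", "Babakan"]}  # illustrative subset
print(kelurahan_message("Cikokal", "Tangerang", sample))
# We couldn't find Cikokal in Tangerang. Did you mean Cikokol?
```

Restricting the candidates to one city keeps suggestions relevant and avoids matching a similarly named kelurahan on another island.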
The Simplicity Tradeoff
Simpler validation means more incorrect addresses enter your system. Stricter validation means users can’t complete checkout because their perfectly deliverable address doesn’t match your database. Finding the right balance depends on your business.
High-value orders might justify stricter validation and human review. Low-value orders benefit from frictionless checkout even if it means some delivery failures. COD (cash on delivery) orders can fail and be reattempted, which gives them a different risk profile from prepaid orders.
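That tradeoff can be written down as an explicit policy rather than left implicit in validation code. In the sketch below, the thresholds, the confidence score, and the action names are all placeholders. Payment type is left out on purpose, since how COD should shift the thresholds depends on your own failure costs.

```python
from dataclasses import dataclass


@dataclass
class Order:
    value_idr: int
    address_confidence: float  # 0.0 to 1.0, from whatever validation you run


def checkout_action(order: Order,
                    high_value_idr: int = 2_000_000,
                    min_confidence: float = 0.6) -> str:
    """Decide what to do with an order; thresholds are placeholders."""
    if order.address_confidence >= min_confidence:
        return "accept"
    if order.value_idr >= high_value_idr:
        return "manual_review"  # worth a human look before shipping
    return "accept_and_flag"    # ship anyway, track the outcome


print(checkout_action(Order(value_idr=5_000_000, address_confidence=0.4)))
# manual_review
print(checkout_action(Order(value_idr=100_000, address_confidence=0.4)))
# accept_and_flag
```

No branch blocks checkout outright: the strictest outcome is a human review.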
Technical Implementation
Store addresses in both structured and unstructured formats. Parse user input into structured fields when possible but also preserve original text. This lets you implement validation against structured data while allowing delivery providers to work with full unstructured text.
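In practice this means one record carrying both views of the address. Here is a minimal sketch with JSON round-tripping. The `AddressRecord` name and its fields are my own choices, not any shipping provider's schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class AddressRecord:
    raw_text: str  # exactly what the user typed; never overwritten
    structured: Dict[str, str] = field(default_factory=dict)  # parsed fields
    delivery_notes: str = ""  # landmarks and navigation hints


def to_json(record: AddressRecord) -> str:
    # ensure_ascii=False keeps any non-ASCII characters readable in storage.
    return json.dumps(asdict(record), ensure_ascii=False)


def from_json(payload: str) -> AddressRecord:
    return AddressRecord(**json.loads(payload))


record = AddressRecord(
    raw_text="Gang Mawar 3, dekat warung Pak Budi",
    structured={"street": "Gang Mawar 3"},
    delivery_notes="dekat warung Pak Budi",
)
print(from_json(to_json(record)) == record)  # True
```

Validation reads `structured`, while couriers receive `raw_text` and `delivery_notes`, so a parsing mistake never destroys what the customer actually wrote.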
Use address parsing libraries but don’t trust them completely. They work for common patterns but fail on edge cases. Have human review processes for flagged addresses rather than automated rejection.
Consider machine learning for address parsing and validation. Models trained on your specific data might outperform rule-based parsing. But ML requires substantial training data and ongoing refinement. Worth considering at scale, probably overkill for smaller platforms.
When to Give Up
Sometimes perfect validation isn’t worth the cost. If your delivery success rate is 95% with simple validation and making it 97% would cost six months of development, maybe 95% is good enough. Diminishing returns kick in fast.
Edge cases are infinite. You’ll never handle every weird address format Indonesians use. Focus validation on common cases and let humans handle exceptions. Customer service reviewing 5% of addresses costs less than engineering perfect validation.
Indonesian address validation is hard because Indonesian addresses are inherently messy. They evolved organically without central planning or standardization. Any system trying to impose rigid structure fights against social reality. The most practical approach accepts messiness, provides helpful guidance, validates what’s validatable, and builds processes to handle the rest.