The logic of the world is prior to all truth and falsehood.
—
Ludwig
Wittgenstein[1]
A curated
list of
falsehoods programmers believe in. A falsehood is an
idea that you initially believed to be true but
that turns out to be false in reality.
Take this idea, for example: a valid email address has exactly one
@ character. So you use this rule to implement your
email-field validation logic. Right? Wrong! In reality,
email addresses can contain multiple @ characters. Therefore, your
implementation should allow this. The initial idea is a
falsehood you believed in.
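The email example above can be sketched in a few lines of Python. This is a minimal illustration, not a full address validator: `split_email` is a hypothetical helper name, and the sample address relies on the quoted local-part syntax, which permits an `@` inside the quotes.

```python
from typing import Tuple


def split_email(address: str) -> Tuple[str, str]:
    """Split an address into (local part, domain) at the LAST '@'.

    Hypothetical helper for illustration only: it does not validate
    the address, it only avoids assuming there is a single '@'.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return local, domain


# A quoted local part may itself contain '@'.
print(split_email('"user@home"@example.com'))
```

Splitting on the last `@` keeps everything before it as the local part, so a rule like "exactly one `@`" never enters the logic.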
The articles listed below give comprehensive lists of
such false beliefs, which you should be aware of
to become a better programmer.
Contents
Arts
Business
- Falsehoods
about Online Shopping - Covers prices, currencies and
inventory.
- Falsehoods about
Prices - Covers currencies, amounts and localization.
- Falsehoods
about IBANs - International Bank Account Numbers are not
international.
- Falsehoods
about Economics - Economics are not simple or rational.
- Decimal
Point Error in Etsy’s Accounting System - The importance of types in
accounting software: missing the decimal point ends up with 100x
over-charges.
- Twenty
five thousand dollars of funny money - Same error as above at Google
Ads, or the danger of separating your pennies from your dollars, where
$250 internal coupons turned into $25,000. My advice: get rid
of integers and floats for monetary values. Use decimals. Or fall back to
strings and parse them, don’t validate.
- Characters
< and > in company names lead to XSS
attacks - Because the UK
allows companies to be registered with special characters, a hacker
leveraged them to register
\"><SCRIPT SRC=MJT.XSS.HT></SCRIPT> LTD, but
also ; DROP TABLE "COMPANIES";-- LTD,
BETTS & TWINE LTD and
SAFDASD & SFSAF \' SFDAASF\" LTD.
- Minutiae
of company names - How the rules of the State of Delaware and the
IRS do not intersect.
- CLDR
currency definitions - Currency validity date ranges overlap due to
revolts, invasions, new constitutions, and slow planned adoption.
- tax -
A PHP 5.4+ tax management library.
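The "use decimals, and parse rather than validate" advice from the list above might look like the following sketch. `parse_amount` and `CENT` are hypothetical names, and the two-decimal-place assumption is itself a falsehood for many currencies, so treat the precision as a placeholder.

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Assumption: a currency with two decimal places. Many currencies differ.
CENT = Decimal("0.01")


def parse_amount(text: str) -> Decimal:
    """Parse a user-supplied amount into an exact Decimal.

    Decimal() raises decimal.InvalidOperation on malformed input, so the
    caller receives either a usable value or an exception.
    """
    amount = Decimal(text.strip())
    if not amount.is_finite():  # reject "NaN" and "Infinity"
        raise ValueError(f"not a finite amount: {text!r}")
    return amount.quantize(CENT, rounding=ROUND_HALF_EVEN)


coupon = parse_amount("250")
print(coupon)  # 250.00, never 25000
print(parse_amount("0.10") + parse_amount("0.20") == Decimal("0.30"))  # True
```

Keeping dollars and cents in one exact value avoids the class of bug where a separately stored cent count is read back as a whole amount.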
Cryptocurrency
Dates and Time
- Falsehoods
about Time - Seminal article on dates and time.
- More
Falsehoods about Time - Part 2 of the article above.
- Falsehoods
about Time and Time Zones - Another take on time-related
falsehoods, with an emphasis on time zones.
- Critique of
Falsehoods about Time - Takes on the first article above and
provides an explanation of each falsehood, with more context and
external resources.
- Falsehoods
about Unix Time - Mind the leap second!
- Falsehoods
about Time Zones - Has some nice points regarding the edge-cases of
DST transitions.
- Your Calendrical
Fallacy Is Thinking… - List covering intercalation and cultural
influence, made by a community of iOS and macOS developers.
- Time Zone Database -
Code and data that represent the history of local time for many
representative locations around the globe.
- The Long, Painful History
of Time - Most of the idiosyncrasies in timekeeping can find an
explanation in history.
- You Advocate a Calendar
Reform - Your idea will not work. This article tells you why.
- So You Want to Abolish Time
Zones - Abolishing timezones may sound like a good idea, but there
are quite a few complications that make it not quite so.
- The Problem
with Time & Timezones - A video about why you should never, ever
deal with timezones if you can help it.
- $26,000
Overcollection by Labor Department - The consequence of wrong
calendar accounting.
- RFC-3339 vs
ISO-8601 - A giant list of formats from the two standards, how they
overlap, and live examples.
- ISO-8601,
YYYY, yyyy, and why your year may be wrong - String formatting of
dates is hard.
- UTC
is Enough for everyone, right? - There are edge cases about dates
and time (specifically UTC) that you probably haven’t thought of.
- Storing
UTC is not a silver bullet - “Just store dates in UTC” is not always
the right approach.
- How to
choose between UT1, TAI and UTC - Depends on your priorities between
SI seconds, earth rotation sync, leap seconds avoidance.
- Why
is subtracting these two times (in 1927) giving a strange result? -
Infamous Stack Overflow answer about both complicated historical
timezones, and how historical dates can be re-interpreted by newer
versions of software.
- Critical
and Significant Dates - From Y2K to the overflow of 32-bit seconds
since the Unix epoch, a list of special dates to watch for depending on the
system.
- “I’m going to a commune in Vermont and will deal with no unit of
time shorter than a season.” - The note left on his terminal by an
engineer who quit in the 70s, after toiling too long on
sub-second timing concerns. Source: The
Soul of a New Machine.
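Two of the pitfalls listed above are easy to reproduce with Python's standard library. This is a rough sketch: `iso_week_year` is a hypothetical helper name, and the sample date was picked because it falls in a week that ISO 8601 assigns to the following year.

```python
from datetime import date, datetime, timezone


def iso_week_year(d: date) -> int:
    """Return the ISO 8601 week-numbering year, which may differ from d.year."""
    return d.isocalendar()[0]


# Monday 30 December 2019 belongs to ISO week 1 of 2020.
d = date(2019, 12, 30)
print(d.year, iso_week_year(d))  # 2019 2020

# Last second representable as a signed 32-bit count since the Unix epoch.
last_int32_second = datetime.fromtimestamp(2**31 - 1, tz=timezone.utc)
print(last_int32_second.isoformat())  # 2038-01-19T03:14:07+00:00
```

Formatting a calendar date with a week-year pattern is how a year ends up wrong around New Year, which is the subject of the YYYY versus yyyy entry.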
Education
Emails
Geography
Human Identity
Internationalization
On character encoding, string formatting, Unicode and
internationalization.
- Falsehoods
about Language - Translating software from English is not as
straightforward as it seems to be.
- Falsehoods
about Language - Additional cases to complement the previous
article.
- Falsehoods
about Plain Text - Plain text can’t cut it, which makes Unicode even
more incredible for its ability to just work well.
- Falsehoods
about text - A subset of the falsehoods from above, illustrated with
some examples.
- Internationalis(z)ing
Code - A video about things you need to keep in mind when
internationalizing your code.
- Minimum
to Know About Unicode and Character Sets - A good introduction to
Unicode, its historical context and origins, followed by an overview of
its inner workings.
- Awesome
Unicode - A curated list of delightful Unicode tidbits, packages and
resources.
- Dark
corners of Unicode - Unicode is extensive, here be dragons.
- Let’s
Stop Ascribing Meaning to Code Points - Dives deeper into Unicode and
dispels myths about code points.
- Unicode
misconceptions - A collection of falsehoods on case, encodings,
string length, and more.
- Breaking
Our
Latin-1 Assumptions - Most programmers spend so
much time with Latin-1 that they forget about the quirks of other
scripts.
- Ode to a shipping label
- Character encoding is hard, more so when each broken layer of data
input adds its own spice.
- Localization
Failure: Temperature is Hard - You cannot localize temperature
differences as-is.
- i18n Testing
Data - Compilation of real-world international and diverse name data
for unit testing and QA.
- Big List
of Naughty Strings - A huge corpus of strings which have a high
probability of causing issues when used as user-input data. A must-have
set of practical edge-cases to test your software against.
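Several entries above mention misconceptions about string length and encodings. As a small sketch using only the standard `unicodedata` module, the same visible character can have different lengths and still be "equal" once normalized. `same_text` is a hypothetical helper name, and the choice of NFC is an assumption for the example.

```python
import unicodedata

decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
composed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE


def same_text(a: str, b: str) -> bool:
    """Compare two strings after NFC normalization (an assumed policy)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(len(decomposed), len(composed))  # 2 1 (code points)
print(len(decomposed.encode("utf-8")), len(composed.encode("utf-8")))  # 3 2 (bytes)
print(decomposed == composed)          # False
print(same_text(decomposed, composed)) # True
```

Neither the code point count nor the byte count matches what a reader perceives as one character, so length limits and equality checks need an explicit policy.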
Management
Multimedia
- Falsehoods
about Video - Covers it all: video decoding and playback, files,
image scaling, color spaces and conversion, displays and subtitles.
- Horrible
edge cases to consider when dealing with music - Music catalog data
is full of crazy stuff.
- MusicBrainz
database schema - An open-source project and database that seems to
have solved the complexity of music catalog management.
- DDEX - The industry
standard for music metadata, including archiving, sound recording, sales
and usage reporting, royalties and license deals.
- Apple
Music Style Guide - Quality assurance guidelines to format music,
art, and metadata to increase discoverability.
Networks
Phone Numbers
Postal Addresses
- Falsehoods
about Addresses - Covers streets, postal codes, buildings, cities
and countries.
- Falsehoods
about Residence - It’s not only about the address itself, but the
relationship between a person and their residence.
- Letter
Delivered Despite No Name, No Address - Ultimate falsehood about
postal addresses: you do not need one.
- UK
Address Oddities - Quirks extracted from a list of most residential
property sales in England and Wales since 1995.
- The Bear
with Its Own ZIP Code - Smokey Bear has his own ZIP Code
(20252) because he gets so much mail.
- Why
doesn’t Costa Rica use real addresses? - Costa Rica uses an
idiosyncratic system of addresses that relies on landmarks, history and
quite a bit of guesswork.
- Regex
and Postal Addresses - Why regular expressions and street addresses
do not mix.
- Parsing the
Infamous Japanese Postal CSV - “I saw many horrors, but I’ve never
seen this particular formatting choice anywhere else.”
- USPS Postal
Addressing Standards - Describes both standardized address formats
and content.
- libaddressinput
- Google’s common C++ and Java library for parsing, formatting, and
validating international postal addresses.
- addressing
- A PHP 5.4+ addressing library, powered by Google’s dataset.
- postal-address
- Python module to parse, normalize and render postal addresses.
- address -
Go library to validate and format addresses using Google’s dataset.
Science
Society
Software Engineering
- Falsehoods
about Versions - Attributing an identity to a software release might
be harder than thought.
- Falsehoods
about Build Systems - Building software is hard. Building software
that builds software is harder.
- Falsehoods
about Undefined Behavior - Invoking undefined behavior can cause
anything to happen, for a much broader definition of “anything”
than one might think.
- Myths
about CPU Caches - Misconceptions about caches often lead to false
assertions, especially when it comes to concurrency and race
conditions.
- Falsehoods
about null pointers - Null pointers are even more cursed than
pointers in general, and provenance already makes pointers quite
complicated.
- Falsehoods
about CSVs - While RFC4180 exists, it is far from definitive and
goes largely ignored.
- Falsehoods
about Package Managers - Covers packages and their managers.
- Falsehoods
about Testing - An attempt to establish a list of falsehoods about
testing.
- Falsehoods
about Search - Why search (including analysis, tokenization,
highlighting) is deceptively complex.
- What
every software engineer should know about search - A better sourced
article on the difficulty of implementing search engines.
- Falsehoods
about Pagination - Why your pagination algorithm is giving someone
(possibly you) a headache.
- Falsehoods
about garbage collection - Misconceptions about the predictability
and performance of garbage collection.
- Myths
about File Paths - Diversity of file-systems and OSes makes file
paths a little harder than we might think.
- The
weird world of Windows file paths - “On any Unix-derived system, a
path is an admirably simple thing: if it starts with a /,
it’s a path. Not so on Windows.”
- Myths about
/dev/urandom - There are a few things about
/dev/urandom and /dev/random that are repeated
again and again. Still they are false.
- Facts
about State Machines - State machines are often misunderstood and
under-applied.
- Hi! My name
is… - This talk could have been named falsehoods about usernames
(and other identifiers).
- Popular misconceptions
about
mtime - Part of a post on why comparing a file’s
mtime could be considered harmful.
- Rules
for Autocomplete - Not falsehoods per se, but still a great
list of good practices to implement autocompletion.
- Floating Point Math -
“Your language isn’t broken, it’s doing floating point math. (…) This is
why, more often than not, 0.1 + 0.2 != 0.3.”
- The
yaml document from hell - YAML is full of obscure complexity like
accidental numbers and non-string keys.
- I am
endlessly fascinated with content tagging systems - There are
edge-cases even in tagging systems which are supposed to be
bare-bones.
- Falsehoods
about Quantum Technology - Common misconceptions about quantum
technology and computers.
- Falsehoods about
Event-Driven Systems - Misconceptions about event-driven systems and
message passing.
- Falsehoods
about Digital Object Identifiers (DOIs) - False conceptions about
the identifiers that are used to identify and link research outputs (and
a lot of other things).
- Falsehoods
about CVE - CVE ≠ vulnerability (and 36 other confusions).
- Falsehoods
about authorization - Misconceptions about implementing permissions
systems.
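The floating point entry in the list above is quick to reproduce. The sketch below uses only the Python standard library; the use of `math.isclose` with its default tolerance and of `Fraction` as an exact alternative are illustrative choices, not recommendations from the linked article.

```python
import math
from fractions import Fraction

total = 0.1 + 0.2

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly.
print(total == 0.3)              # False
print(math.isclose(total, 0.3))  # True, within the default relative tolerance

# Exact rational arithmetic does not have this problem.
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))  # True
```

Comparing with a tolerance suits measurements. Exact types suit values that must add up, such as the monetary amounts discussed in the Business section.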
Transportation
Typography
Video Games
- The
Door Problem - All the things you have not considered implementing
for your doors in games.
Web
Contributing
Your contributions are always welcome! Please take a look at the contribution
guidelines first.
This list has gained some popularity on social media over the past few
years. See it being discussed
and mentioned elsewhere.
The header
image is based on a modified photo
taken in February 2010 by Iza Bella, distributed under a Creative
Commons BY-SA 2.0 UK license.
[1]: Notebooks,
1914-1916 (Liveright, 2022) - source:
page 14e.