Maps and Morse Code

I've always wanted to better articulate one of the things that fascinates me about cartography, map-making, and geospatial data about the natural, perceivable world. I sometimes contribute to OpenStreetMap, partly because I'm drawn to the idea of categorising what we can see and hear into polygons, lines, and points on a map that communicate meaning by stripping away what we don't deem relevant. I asked Claude to help me organise and better articulate my thoughts and examples, and also to look at Morse Code as an example of how analogue signals can be categorised into dots and dashes, and how only then can meaning be made from them.

What you're circling around is genuinely rich, and it touches on ideas from information theory, semiotics, and the philosophy of representation. Let me try to reflect it back in a more structured way, and then sharpen your examples.

The core tension you've identified

You're noticing that representation is not a passive copy - it's an active transformation. And that transformation does at least three things simultaneously:

Attenuates - it discards variation that doesn't serve the purpose of the representation
Amplifies - it makes certain patterns more legible than they are in the raw phenomenon
Creates - it can produce information that didn't exist in the source at all

Most people only think about the first one (loss). You're noticing the second and third, which is the more interesting observation.

Discretisation: the foundational operation

The thread connecting almost all your examples is discretisation - taking continuous, variable phenomena and binning them into discrete categories. This is what the lay person does when they see "a tree" rather than a specific organism. It's what the cartographer does when they classify terrain as "forest" rather than mapping individual crowns. And crucially, it's what the Morse receiver does.

A Morse receiver hears an analogue signal - varying in duration, intensity, perhaps distorted by interference. But they collapse all of that variation into just three categories: dot, dash, and silence (with silence itself having grades - between symbols, between letters, between words). The continuous is made discrete. Everything else is treated as noise, or as the carrier rather than the content.

This is important because it means the same signal can be noise to one receiver and information to another - not because the signal changes, but because one receiver has a discretisation scheme and the other doesn't. Morse Code to someone who doesn't know it isn't meaningless exactly - they could still notice "there's a pattern here" - but they can't resolve it into the categories that unlock meaning.
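
To make the binning step concrete, here is a minimal Python sketch of a receiver's discretisation scheme. The function name, the 0.1-second unit and the cut-off values are illustrative assumptions, loosely based on conventional Morse timing (a dash lasts three dot-lengths; the gaps last one, three and seven). A real receiver would also adapt to the sender's speed.

```python
def discretise(pulses, unit=0.1):
    """Collapse (is_on, duration_in_seconds) pairs into Morse symbols.

    `unit` is the assumed length of one dot. The cut-offs (2 and 5 units)
    are hypothetical midpoints between the nominal 1, 3 and 7 unit timings.
    """
    symbols = []
    for is_on, duration in pulses:
        units = duration / unit
        if is_on:
            symbols.append("." if units < 2 else "-")
        elif units >= 5:
            symbols.append(" / ")  # silence long enough to separate words
        elif units >= 2:
            symbols.append(" ")    # silence long enough to separate letters
        # shorter silences only separate symbols inside a letter: emit nothing
    return "".join(symbols)


# A slightly sloppy, hand-keyed SOS: no two durations are the same.
noisy_sos = [
    (True, 0.09), (False, 0.11), (True, 0.12), (False, 0.08), (True, 0.10),
    (False, 0.31),
    (True, 0.33), (False, 0.10), (True, 0.28), (False, 0.12), (True, 0.30),
    (False, 0.29),
    (True, 0.11), (False, 0.09), (True, 0.10), (False, 0.10), (True, 0.08),
]

print(discretise(noisy_sos))  # ... --- ...
```

Seventeen different durations go in; only three kinds of thing come out. All the variation in between has been treated as noise.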

The codebook: where meaning lives

This gets at something Shannon's information theory deliberately sidestepped. Shannon was interested in the technical problem: can a signal be transmitted reliably? He explicitly bracketed the semantic problem: does it mean anything? But you're working in the space Shannon left aside.

Meaning in Morse Code - or in any symbolic system - requires what you might call a shared codebook: a mapping between symbol and referent that both sender and receiver possess. The dots and dashes are not the meaning. The carrier (radio wave, flashing light, sound) is not the meaning. The meaning lives in the codebook, which is held in the minds of sender and receiver and isn't present in the signal at all.
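
A tiny Python sketch of that point: the same symbol string handed to a receiver who holds a codebook and to one who doesn't. The dictionary is a deliberately partial subset of International Morse, and the function name and the " / " word separator are my own illustrative choices.

```python
# A partial codebook: a few letters of International Morse.
CODEBOOK = {
    ".": "E", "..": "I", "...": "S", "---": "O",
    "-": "T", ".-": "A", "-.": "N",
}


def decode(symbols, codebook):
    """Map groups of dots and dashes to letters; unknown groups become '?'."""
    words = []
    for word in symbols.split(" / "):
        letters = [codebook.get(group, "?") for group in word.split()]
        words.append("".join(letters))
    return " ".join(words)


signal = "... --- ..."
print(decode(signal, CODEBOOK))  # SOS
print(decode(signal, {}))        # ???
```

The signal is identical in both calls. The only thing that differs is what the receiver brings to it.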

This is why you can transmit Morse Code via anything capable of binary alternation with sustained state - light, sound, electrical pulse - but you can't straightforwardly transmit it by knocking on a wall. A knock is either present or absent (staccato); it has no duration of its own. Unless sender and receiver agree on some extra convention, such as encoding dashes in the spacing between knocks, the wall lacks what you might call durational affordance. It can carry some letters (E is a single dot, I is two dots, S is three, H is four), but it can't carry the full code because the channel doesn't support the distinction the code depends on.

That's a beautifully concrete example of how the channel constrains what can be represented, independent of what sender and receiver know.
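
Here is a rough way to check the knocking claim in Python. It assumes the simplest possible knock channel, where every dot and every dash arrives as one identical tap and nothing is encoded in the spacing between taps. The function name is hypothetical; the table is the International Morse alphabet for A to Z.

```python
from collections import defaultdict

MORSE = {
    "A": ".-",   "B": "-...", "C": "-.-.", "D": "-..",  "E": ".",
    "F": "..-.", "G": "--.",  "H": "....", "I": "..",   "J": ".---",
    "K": "-.-",  "L": ".-..", "M": "--",   "N": "-.",   "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.",  "S": "...",  "T": "-",
    "U": "..-",  "V": "...-", "W": ".--",  "X": "-..-", "Y": "-.--",
    "Z": "--..",
}


def through_knock_channel(code):
    """Assumed channel model: a knock has no duration, so a dash becomes a dot."""
    return "." * len(code)


# Letters whose code passes through the channel unchanged (dots only).
survivors = sorted(
    letter for letter, code in MORSE.items()
    if through_knock_channel(code) == code
)

# Letters that become indistinguishable once duration is lost.
collisions = defaultdict(list)
for letter, code in MORSE.items():
    collisions[through_knock_channel(code)].append(letter)

print(survivors)        # ['E', 'H', 'I', 'S']
print(collisions["."])  # ['E', 'T']
```

Under that assumption only four letters arrive intact, and even those are ambiguous at the receiving end: a single knock could be E or T. Sender and receiver share the full codebook, but the channel has collapsed the distinction it relies on.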

Emergent and synthetic information

Now to the part I think is your most interesting observation: that representation can create information.

Your contour line example is the clearest case. Contour lines do not exist in nature. Standing in a valley, you cannot see a contour line. But a topo map lets someone who has never visited that location understand the shape of the terrain - they can identify ridgelines, cliff faces, the gradient of a slope, whether a valley drains north or south. The map communicates something that is not perceptible from within the territory. It requires an overhead, abstracted perspective that no human can occupy physically.
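
A small Python sketch of how a contour level is manufactured from measurements rather than observed. It takes elevation samples along a straight transect and linearly interpolates where each contour level would be crossed. The sample numbers, the 20 m interval and the function name are invented for illustration; real contouring works on a two-dimensional surface, not a single line.

```python
import math


def contour_crossings(distances, elevations, interval):
    """Return (distance, level) pairs where the profile crosses a contour level.

    Uses linear interpolation between neighbouring samples. A level that
    coincides with the start of a segment is credited to the previous segment,
    so a sample sitting exactly on a level is not counted twice.
    """
    crossings = []
    for i in range(len(elevations) - 1):
        d0, d1 = distances[i], distances[i + 1]
        z0, z1 = elevations[i], elevations[i + 1]
        if z0 == z1:
            continue  # flat segment: no level is crossed
        low, high = min(z0, z1), max(z0, z1)
        level = math.ceil(low / interval) * interval
        found = []
        while level <= high:
            if level != z0:
                t = (level - z0) / (z1 - z0)
                found.append((d0 + t * (d1 - d0), level))
            level += interval
        crossings.extend(sorted(found))  # keep crossings in travel order
    return crossings


# Hypothetical transect: distance along the line and elevation, both in metres.
distances = [0, 100, 200, 300, 400]
elevations = [212, 238, 305, 290, 255]

crossings = contour_crossings(distances, elevations, 20)
for distance, level in crossings:
    print(f"{distance:6.1f} m  ->  {level} m contour")
```

None of the 220, 240, 260 levels was measured; each is computed. And the spacing of the crossings carries the new information: four of them bunch up between 100 m and 200 m, which is how a map reader sees that this is the steep part without ever standing on it.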

Rock climbing grades work similarly. When someone grades a route, they're synthesising information from many ascents, across many bodies, and distilling it into a single number or letter. That grade doesn't exist in the rock. It's an emergent property of the interaction between human physiology and the rock face, expressed as a symbol. A climber who knows the grading system can look at "5.11a" and form expectations about the nature of the challenge - information that isn't present in the rock itself and can't be perceived by looking at it.

Your satellite-to-vector work is the same process. When you trace a road polygon from imagery, you're not just copying what's there - you're classifying it. You decide: this is a road, not a river. This is public, not private. This is a walking track, not an animal trail. Each of those decisions encodes knowledge and judgment that isn't in the pixel values. The semantic vector is more than a compressed version of the image - it's the image plus human interpretation, expressed in a new form.

A way to frame the whole thing

You might find it useful to think about these transformations in terms of three distinct layers:

1. The carrier - the physical medium (radio wave, paper, ink, light). It has affordances and constraints (can it sustain duration? can it vary in amplitude?). It is not the signal.

2. The signal - the pattern imposed on the carrier. It has structure (dots and dashes, pixels, contour lines) but no inherent meaning.

3. The codebook - the shared interpretive scheme that maps signal to meaning. This is where semantics lives. It exists in minds and communities, not in the signal.

Most of what you're describing - maps, Morse, tree identification, climbing grades - involves choices made at layer 2 (what to encode, how to discretise, what to treat as signal vs. noise) that are only intelligible through layer 3 (the shared conventions of cartography, telegraphy, taxonomy, or climbing culture).

And the "information created" phenomenon happens when the act of structuring the signal at layer 2 makes visible patterns that weren't legible in the raw phenomenon - like how contour lines make terrain shape legible in a way that the terrain itself, experienced from within, does not.

A possible synthesis

The representation of the natural world - whether as a map, a species name, a climbing grade, or a Morse transmission - is never simply a copy with detail removed. It is a purposeful transformation that:

selects which variation counts as signal and which as noise (relative to a purpose)
discretises continuous phenomena into categorical distinctions
requires a shared codebook to resolve into meaning
and can, through that structuring, produce emergent information not present in the source

The loss is real, but it's in service of legibility. And legibility is not neutral - it's always legibility for someone, for a purpose, using a shared scheme. Morse Code is noise to someone without the codebook. A topo map is noise to someone who doesn't know the conventions for reading contour lines. What's signal and what's noise is always relative to the receiver.

I followed up with a question about the motivations and intent of the people who make maps, translate the natural world into digital representations, or create algorithms for automated discretisation.

The people and organisations doing this work span a wide spectrum, and their motivations shape every decision about what counts as signal - what gets named, bounded, classified, and how finely.

State and institutional actors

National mapping agencies (Ordnance Survey, USGS, Geoscience Australia, etc.) were historically the dominant producers of topographic representation. Their motivations are broadly administrative and infrastructural - they need maps that support land tenure, defence, taxation, resource management, and emergency response. Their discretisation choices reflect those purposes: cadastral boundaries matter enormously; individual trees don't. Their classifications tend toward standardisation and longevity. They need a vocabulary that works across the whole country and remains stable across decades, which means it can't be too fine-grained or too responsive to local variation.

A consequence of that institutional conservatism is that their semantic schemas often lag the landscape. A category like "forest" or "cleared land" may be decades old and not reflect current ground truth. The discretisation was done once, by a surveyor or photogrammetrist following a rulebook, and it persists.

Commercial and platform actors

Google, Apple, HERE, TomTom and similar companies are primarily motivated by routing and commercial utility. Their representation choices converge on: is this a road, what speed and direction, what's the business at this address? The natural environment almost disappears from their model except where it is an obstacle or a destination. A mountain range becomes a routing constraint. A national park becomes a polygon label. Individual landforms, vegetation types, rock exposures - irrelevant to the turn-by-turn problem they're solving.

What's interesting about these actors is that they've increasingly moved toward machine learning pipelines applied to satellite and aerial imagery, sometimes supplemented by Street View-style ground-level capture. The discretisation is partly algorithmic - a model trained to detect road edges, building footprints, water bodies. The semantic schema is still human-defined (someone decided what categories the model should output), but the individual binning decisions are made by a classifier, not a person. This creates its own interesting artefacts: a classifier trained on one landscape can fail on another, and it may confidently produce wrong categories at the edges of its training distribution.

Scientific and ecological actors

Remote sensing researchers, ecologists, and land management agencies work with aerial and satellite imagery to classify vegetation, map habitat, monitor land cover change. Their motivations are empirical and often longitudinal - they need categories stable enough to measure change over time. So a classification like "closed-canopy forest," "open woodland," "shrubland," "grassland" needs to be applied consistently across years or decades so that deforestation or regeneration is detectable.

Their classification schemes tend to be more explicitly principled than cartographic ones - often grounded in ecological theory about what categories are meaningful for biodiversity, fire risk, carbon accounting, or water yield. But they still involve judgment calls about where to draw the line between "open woodland" and "shrubland" that a different ecologist might draw differently.
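
To see how much rides on those judgment calls, here is a deliberately simplified Python sketch. It pretends the classes can be separated on a single axis, percent canopy cover, which real schemes do not do (they also use height and growth form). The breakpoints in both schemes are invented for illustration and do not come from any published classification.

```python
import bisect

LABELS = ["grassland", "shrubland", "open woodland", "closed-canopy forest"]

# Two hypothetical ecologists, two sets of breakpoints (percent canopy cover).
SCHEME_A = [10, 30, 70]
SCHEME_B = [5, 40, 60]


def classify(cover_percent, breakpoints):
    """Bin a continuous cover value into one of the four labels."""
    return LABELS[bisect.bisect_right(breakpoints, cover_percent)]


plots = [3, 8, 35, 65, 90]  # made-up field plots
for cover in plots:
    a = classify(cover, SCHEME_A)
    b = classify(cover, SCHEME_B)
    note = "" if a == b else "  <- schemes disagree"
    print(f"{cover:3d}%  A: {a:22} B: {b:22}{note}")
```

The two schemes agree at the extremes and disagree on the three plots in between, even though nothing on the ground differs. That is why longitudinal work has to hold the scheme fixed: otherwise a change of breakpoints would register as a change in the landscape.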

For these actors, what you were doing - mapping exposed rock - would probably be described as something like outcrop mapping or bare rock classification. It's a real category in land cover schemes, though usually a residual one: "not vegetation, not water, not built." Your motivation (navigating through it) is orthogonal to their motivation (measuring soil erosion risk or habitat extent), but you'd probably end up drawing similar polygons.

Community and volunteer actors: the OSM case

OSM is unusual and worth dwelling on because it's a case where a community collectively maintains both the data model (what categories are possible) and the ground-level decisions (what gets classified as what). Elsewhere, those two jobs are usually done by different people.

The data model - the tag schema, what keys and values are valid - emerges through community consensus, often contentious. Someone proposes a new tag (natural=bare_rock, say), others debate whether it's necessary, whether it overlaps with existing tags, whether it's mappable without ambiguity. The tag either achieves acceptance or doesn't. That process is essentially collective negotiation over what categories are real and meaningful enough to deserve a place in the schema.

Then individual mappers make local discretisation decisions within that schema. As you've noticed, motivations diverge: some mappers are completists who want to capture everything, some are thematic, and some, like you, are practically motivated - you mapped rock exposures because you had a specific navigational problem. Someone else might have mapped the same area with no rock at all because they were focused on trails.

What's notable is that OSM's model is permissive but not neutral. The tag schema encodes assumptions about what the world consists of. Some things are easy to map because the schema has rich vocabulary for them (roads have extraordinary granularity). Other things are hard to map because the schema is underdeveloped - subtle landform features, vegetation gradations, the difference between a seasonal stream and a drainage line. Mappers working in built environments and mappers working in natural environments are working with very different levels of schema support.

Specialist community actors: climbing, hiking, paddling

This is where representation departs most interestingly from cartographic convention, and for principled reasons.

Climbing crag maps are a good example of what a cartographer might describe as a deliberately distorted representation, chosen for communicative purpose rather than geometric accuracy. A topo of a crag face might represent a vertical surface as if viewed straight on - which is geometrically impossible from any real viewpoint - because the climber's problem is understanding the sequence of moves on the face, not the spatial relationship between the crag and the car park. The projection is warped to make the relevant information legible. Scale is locally adjusted so that a crux section with complex features gets expanded, while easier sections compress.

This is the same logic as a transit map - the London Underground map famously distorts geography to make station sequences and line connections legible. Everyone accepts the distortion because the map is designed for a specific task (navigating the network) and optimised for that task, not for spatial accuracy.

Your river map that stretched the waterway linearly is the same move - if you're navigating a river, your problem is understanding what you'll encounter in sequence, not where the river sits in Cartesian space. The linearity is coherent for that purpose. Your discomfort with it might come from a mismatch between the map's implied purpose and your expectation of what a map is for.

The motivational axes

If you wanted to describe these actors' motivations along some axes, you might use:

Purpose of navigation - what kind of navigation problem is the map solving? Vehicle routing, wilderness travel, scientific measurement, legal boundary establishment, recreation? The purpose shapes what counts as relevant signal almost entirely.

Temporal scope - is the map meant to be accurate once, stable for decades, or updated continuously? This shapes how fine-grained discretisation can be, since finer categories are harder to keep current.

The body the map is made for - a climbing topo is made for a human body ascending a specific surface. A cadastral map is made for a legal system. A habitat map is made for an ecological model. The intended "receiver" of the representation shapes what gets encoded, just as a codebook is only meaningful to receivers who hold it.

Authority and consensus - state agencies impose a schema; OSM negotiates one; a solo climber drawing a crag topo invents one for themselves. The degree to which the schema is shared affects how widely the map can communicate meaning.

The through-line connecting all of these is that discretisation is never a purely technical act - it's a claim about what distinctions matter, and that claim is always made from somewhere, by someone, for some purpose. The rock exposure polygons you drew are meaningful to a bushwalker but would be noise to a road routing algorithm. The distinction between public road and private road matters enormously to a trail navigator and not at all to an ecologist mapping vegetation. The same visible surface, differently binned, for different receivers.
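
That last point can be sketched in a few lines of Python. The tag keys and values are real OSM ones (natural=bare_rock, highway=path, access=private and so on), but the feature list and the three receiver rules are made up for illustration; an actual router or habitat model would apply far more elaborate rules.

```python
# Five hypothetical features from the same patch of ground.
FEATURES = [
    {"id": 1, "tags": {"natural": "bare_rock"}},
    {"id": 2, "tags": {"highway": "residential"}},
    {"id": 3, "tags": {"highway": "path"}},
    {"id": 4, "tags": {"highway": "service", "access": "private"}},
    {"id": 5, "tags": {"natural": "wood"}},
]


def bushwalker(tags):
    """Cares about walkable paths and terrain, and must avoid private ways."""
    if tags.get("access") == "private":
        return False
    return tags.get("highway") == "path" or "natural" in tags


def road_router(tags):
    """Cares only about drivable roads in this toy rule set."""
    return tags.get("highway") in {"residential", "service"}


def ecologist(tags):
    """Cares only about land cover."""
    return "natural" in tags


RECEIVERS = {
    "bushwalker": bushwalker,
    "road_router": road_router,
    "ecologist": ecologist,
}


def signal_for(receiver):
    """Ids of the features this receiver treats as signal; the rest is noise."""
    keep = RECEIVERS[receiver]
    return [feature["id"] for feature in FEATURES if keep(feature["tags"])]


for name in RECEIVERS:
    print(f"{name:12} {signal_for(name)}")
```

The data never changes between the three calls. Each receiver simply brings a different rule for what counts, and gets a different map out of the same five features.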