LiDAR Point Cloud Classification: the step that turns data into information

A raw LiDAR point cloud is dense, accurate, and georeferenced. It's also largely unusable — unless it's been classified. This article explores why classification should be a standard part of any LiDAR survey deliverable, what it actually involves, and what happens to the market when it gets skipped.

Paolo Corradeghini

5/11/20264 min read

You received the data from a LiDAR survey. You open the file, load the point cloud, and you're looking at millions of coloured points representing everything the sensor captured: trees, ground, roads, buildings. The data is dense, accurate, georeferenced.

And you have no idea what to do with it.

Not because you lack the skills — but because that data was never prepared to be used. It was acquired, processed, and delivered as-is: raw. And in that form, for most clients, it simply isn't useful.

This article is about LiDAR point cloud classification: what it is, why it matters, what you get when it's done properly — and what happens when it isn't done at all.

LiDAR has been democratised. The workflow hasn't.

A few years ago, LiDAR sensors were expensive tools accessible only to a handful of specialised operators. Today, prices have dropped significantly and the number of operators offering LiDAR surveys has grown.

That's a good thing. It means the technology reaches more contexts, more clients, more problems to solve.

But one aspect hasn't kept pace: data culture. Acquiring a point cloud with a LiDAR sensor has become "relatively" straightforward. Delivering a classified, ready-to-use dataset requires additional expertise and processing time — and it's often skipped.

The result: LiDAR surveys whose final output is a raw point cloud. Technically valid. Practically hard to use.

What happens when the data isn't classified

The client — an engineering firm, a public authority, a quarrying company — receives the point cloud, opens it in their software (if they have one), and finds themselves staring at an undifferentiated mass of points. Vegetation and ground mixed together. No way to extract a terrain model. No way to identify a track or a road.

After one or two experiences like this, the risk is they stop requesting LiDAR altogether.

Not because the technology doesn't work — but because they've never seen what it can do when the data is properly processed. For them, LiDAR has become complicated, expensive to handle, and difficult to interpret.

This is a sector problem, not just a project problem.

What point cloud classification actually is

Classification is the process of assigning a semantic category to every point in the cloud: ground, low vegetation, high vegetation, buildings, noise. Each class gets a code — the most widely used standard is the one defined by ISPRS — and from that point on, the data becomes queryable by category.

Want only the terrain? Filter the ground class and you get the Digital Terrain Model (DTM). Want high vegetation? Filter that class and you can analyse canopy coverage, height, density. Want everything except noise? Exclude that class and the cloud is clean.

In vegetated environments — like the quarry area shown here — classification is what separates the millions of points representing trees and shrubs from the points that represent the actual ground.

Two things that determine how well it works

The sensor matters. Not all LiDAR sensors are equally capable of detecting ground beneath vegetation. The key capability is multi-return (or multi-echo): each laser pulse can return multiple times. The first return hits the canopy; subsequent returns penetrate deeper and reach the ground. A single-return sensor only sees the vegetation. A multi-return sensor captures ground data too, even under tree cover.

This is not a minor technical detail — it's the prerequisite for producing a reliable DTM beneath the canopy.

The season matters. In late autumn, winter, or early spring — before deciduous vegetation has leafed out — the canopy is far less dense. Laser pulses penetrate more easily, more points reach the ground, and classification produces cleaner results.

In midsummer, with full canopy cover, penetration is at its minimum. Not impossible — but harder, and the resulting DTM will be less detailed in the most densely vegetated areas.

When you can plan the survey date, season is a variable worth considering.

Software helps. But often not enough.

Automatic classification algorithms are built into several point cloud processing tools. They can produce decent results on open terrain with simple morphology. In complex contexts — transition zones between vegetation and bare ground, steep slopes, quarry faces, urban environments — automatic algorithms tend to make mistakes. They classify low shrubs as ground. They miss narrow tracks hidden beneath canopy. They assign real structures to the noise class.

This isn't a failure of the software. It's simply the limit of any automated approach applied to complex data.

When this happens, the operator needs to step in: review the automatic output, correct misclassified points, integrate with visual analysis. It takes time. But it's what makes the difference between a representative dataset and one that tells you very little.

What the classified cloud reveals: this quarry

In this survey, the area of interest includes both active quarry sections — rock faces, working platforms, access roads — and forested zones between extraction levels.

Looking at aerial imagery, those vegetated areas are simply green. Nothing more to read. Tracks, secondary roads, terrain morphology: all hidden beneath the canopy.

From the classified point cloud, a different picture emerges. Road tracks under the vegetation become visible. Slopes, drainage lines, scarps — readable in detail, even where photogrammetry would never reach.

That is the real value of LiDAR over other survey methods: not the colourful point cloud itself, but the information that a classified cloud makes it possible to extract.

What to ask for when you commission a LiDAR survey

If you're on the client side — evaluating a LiDAR survey for a project or a site analysis — there's one practical step you can take to avoid receiving unusable data.

State it explicitly in the specification or agreement: the classified point cloud is part of the deliverable. Not an option. Not a "we can do it if you want". A standard requirement.

Specify the classes you need: ground, high vegetation, low vegetation, buildings, infrastructure. Specify the delivery format (LAS/LAZ with ISPRS standard classification). And if the site has significant vegetation, confirm that the sensor being used supports multi-return.

It's not an excessive technical request. It's asking for a finished dataset, not a raw one.

The responsibility sits with whoever does the survey

Point cloud classification is not an add-on the client has to ask for separately. It should be a standard part of any professional LiDAR survey in vegetated terrain — just as georeferencing is standard in any survey, not an extra.

The surveyor knows the context, has access to the raw data, and has the tools to classify it. The client usually doesn't.

If the data is delivered raw, the problem isn't the client's inability to use it. It belongs to whoever delivered an unfinished job.

Making classification a standard deliverable is in everyone's interest: the client gets usable data; the surveyor demonstrates competence; and the sector grows — because clients finally understand what LiDAR is actually capable of.nuto del mio post