Anyone who runs a server with SSH exposed to the internet sees the same pattern in the logs. A steady stream of automated scanners tries to log in, hour after hour, from addresses all over the world. The common picture of what comes next has an attacker landing a shell, looking around the system, and typing commands. The reality recorded across eleven research honeypots looks almost nothing like that.

Eleven SSH honeypots ran on cloud servers in Frankfurt, Germany, for fifteen days in late May and early June, in a study by researchers at the Czech Technical University in Prague. Together they logged 177,622 authenticated sessions, each one a client that got past the login.

The split among those sessions was lopsided. Non-interactive sessions, where a client logs in, runs one command, reads the output, and disconnects, accounted for 99.23% of the total. Interactive shell sessions, the kind the honeypot field was built to study, came to 0.10%. File transfers made up the small remainder.

A non-interactive session works in a particular way. The client authenticates and issues a single command through an SSH exec request. The server runs it without allocating a terminal, returns the output, and closes the channel. The whole exchange finishes in under a second, faster than a person can type. These are scanners and exploit scripts running at machine speed, logging in to confirm a fact about the host and moving on.
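The three session types map naturally onto the channel request types defined by the SSH connection protocol ("exec", "shell", "subsystem"). The sketch below shows one way a log processor could make that split. The rules, the `SessionKind` names, and the `(request_type, payload)` record shape are assumptions for illustration, not the classification the researchers used.

```python
from enum import Enum


class SessionKind(Enum):
    INTERACTIVE = "interactive"
    NON_INTERACTIVE = "non-interactive"
    FILE_TRANSFER = "file-transfer"
    UNKNOWN = "unknown"


def classify_session(requests):
    """Classify one authenticated session.

    `requests` is a list of (request_type, payload) tuples, a hypothetical
    log format. Request types follow the SSH channel request names.
    """
    # File transfers: the sftp subsystem, or scp launched through exec.
    for req_type, payload in requests:
        if req_type == "subsystem" and payload == "sftp":
            return SessionKind.FILE_TRANSFER
        if req_type == "exec" and payload.split()[:1] == ["scp"]:
            return SessionKind.FILE_TRANSFER

    types = {req_type for req_type, _ in requests}
    if "shell" in types:
        # A shell request (usually preceded by pty-req) means a live prompt.
        return SessionKind.INTERACTIVE
    if "exec" in types:
        # One command, no terminal: the pattern behind 99.23% of sessions.
        return SessionKind.NON_INTERACTIVE
    return SessionKind.UNKNOWN
```

For example, `classify_session([("exec", "uname -a")])` yields `SessionKind.NON_INTERACTIVE`, while a `pty-req` followed by `shell` yields `SessionKind.INTERACTIVE`.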

A result that holds on someone else’s sensors

The honeypots ran on a modified version of an open-source tool called AdvancedShelLM, which uses a large language model to generate realistic shell output. A locally hosted model handled most of the sessions, with two OpenAI models as backup. The backend governed only the responses the honeypots returned. The measurement concerned the traffic attackers sent, which the model did not influence.

To confirm the result held beyond their own deployment, the researchers compared it against an independent dataset from CZ.NIC, the operator of a honeypot service built on thousands of Cowrie sensors. That dataset held more than a quarter of a million logged-in sessions over the same window. Among sessions that carried at least one command, 92.67% carried exactly one. The pattern held on hardware run by a different operator.
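The cross-check statistic is a conditional share: sessions with no commands are dropped first, then the single-command ones are counted against what remains. A minimal sketch, assuming each session has already been reduced to its command count (the function name and input shape are illustrative):

```python
def single_command_share(commands_per_session):
    """Percent of sessions with exactly one command, among those with >= 1."""
    with_commands = [n for n in commands_per_session if n >= 1]
    if not with_commands:
        return 0.0
    single = sum(1 for n in with_commands if n == 1)
    return 100.0 * single / len(with_commands)
```

With made-up counts `[0, 1, 1, 1, 3]`, the empty session is excluded and three of the remaining four sessions carry one command.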

Most of the traffic is reconnaissance

The ten most common non-interactive commands covered 41.59% of that traffic, and most of them gather basic facts about the machine. Variants of uname, which reports the operating system and kernel, sat at the top of the list. Others asked for the processor count, the logged-in user, the graphics hardware, and the system uptime. These commands collect information that tells an automated campaign whether the box is worth a second look.
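A figure like the top-ten share can be reproduced from any list of logged command strings. The sketch below is one way to compute it; the function name and the toy input are assumptions, and it treats two commands as the same only when their text matches exactly.

```python
from collections import Counter


def top_n_share(commands, n=10):
    """Percent of all command occurrences covered by the n most common ones."""
    counts = Counter(commands)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(count for _, count in counts.most_common(n))
    return 100.0 * covered / total
```

Grouping variants of the same tool (different `uname` flags, for instance) would need a normalization step before counting, which this sketch leaves out.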

Some scanners check whether you are the trap

A smaller group of commands had a different job. Some scanners test whether the thing answering them runs commands for real. The team recorded 2,178 sessions of this kind. One campaign sent a base64-encoded string and decoded it, an operation that returns a known answer on a working system. Others asked for simple arithmetic, dumped the contents of a binary, or wrote a file and read it back.

This carries weight for the newer class of honeypots built on language models. A model can produce shell output that looks plausible and is wrong. A scanner that checks the math, decodes the string, or confirms that a file persists catches the difference in a single command. Success for these honeypots comes down to surviving that check.
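The decode probe works because the scanner already knows the answer. A rough sketch of that check from the scanner's side, using Python's standard `base64` module; the token and the function name are hypothetical, and the strings real campaigns sent are not given here.

```python
import base64


def passes_decode_probe(encoded_token, host_output):
    """Return True if the host's output matches the true decoded value.

    A real shell running `echo <token> | base64 -d` returns exactly the
    decoded bytes. A model that improvises the output will likely miss.
    """
    expected = base64.b64decode(encoded_token).decode("utf-8")
    return host_output.strip() == expected.strip()


# Hypothetical probe token, invented for this example.
token = base64.b64encode(b"probe-1234").decode("ascii")
```

An operator can run the same comparison against a honeypot's own responses to see whether it would survive the check before attackers run it.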

Honeypot operators fingerprinting attackers is old ground. The reverse showed up here as well. A handful of sessions looked for the tells of known honeypots, listing processes for Cowrie or kippo and testing whether system files were writable. The counts were small, and the authors treat them with care.
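Sessions like these can be surfaced with a simple name match over logged commands. This is a minimal sketch, assuming a plain list of command strings; the pattern covers only the two honeypot names mentioned above and says nothing about how the authors identified them.

```python
import re

# Names of known honeypots that a fingerprinting command might look for.
HONEYPOT_NAMES = re.compile(r"\b(cowrie|kippo)\b", re.IGNORECASE)


def mentions_known_honeypot(command):
    """True if the command text references a known honeypot by name."""
    return bool(HONEYPOT_NAMES.search(command))
```

A match is only a lead. The writability tests described above carry no such keyword and would need separate rules.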

What the scanners tested was outputs and machine state, not the model behind them. The team screened every session for prompt-injection strings and for mentions of AI or model names, and found none of either. Worry about attackers talking their way past a language model has little support in this data, at least for now.
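A screen of this kind can be approximated with a short pattern list run over every command in a session. The patterns below are illustrative guesses at what such a list might contain. The researchers' actual search terms are not given, so treat both the patterns and the function name as assumptions.

```python
import re

# Hypothetical indicators: common injection phrasing and AI or model mentions.
SCREEN_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"ignore (all |any )?(previous|prior|above) instructions",
        r"system prompt",
        r"\b(chatgpt|openai|llm|language model)\b",
    )
]


def screen_session(commands):
    """Return the commands in a session that match any screening pattern."""
    return [c for c in commands if any(p.search(c) for p in SCREEN_PATTERNS)]
```

An empty result for every session corresponds to the "none of either" outcome reported above, within the limits of whatever pattern list is used.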

A habit that settled years ago

The historical record points to a settled behavior. CZ.NIC’s archive runs back to 2017 and holds more than 400 million sessions, and non-interactive traffic has been the majority since around 2018. One sharp move came in October 2024, when the non-interactive share climbed to 97.4% in a single month, a jump of more than seventeen points, alongside a spike in total volume.

The result carries a warning for how honeypots get judged. Many designs measure success by engagement, counting how long an attacker stays and how many commands they run. A traffic stream made almost entirely of single sub-second commands gives those metrics little to work with. A honeypot that only offers an interactive shell, and refuses non-interactive requests, records a version of attacker behavior the honeypot itself created.

The login attempts filling the logs are mostly triage. An automated client confirms the host is real, files it for later, and leaves. The value sits in recognizing that pattern and grouping the noise into campaigns, so a thousand one-second touches resolve into the handful of operations behind them.
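One simple way to do that grouping is to key each session on a fingerprint of its command, so identical probes from many addresses collapse into one bucket. The sketch below assumes `(source_ip, command)` pairs and uses whitespace normalization plus a SHA-256 prefix as the key. Both choices are illustrative, and real campaign attribution would weigh more signals than the command text.

```python
import hashlib
from collections import defaultdict


def campaign_key(command):
    """Fingerprint a command after collapsing runs of whitespace."""
    normalized = " ".join(command.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]


def group_into_campaigns(sessions):
    """Map each command fingerprint to the source addresses that sent it."""
    groups = defaultdict(list)
    for source_ip, command in sessions:
        groups[campaign_key(command)].append(source_ip)
    return dict(groups)
```

Fed three sessions where two addresses send `uname -a` with different spacing and a third sends `nproc`, this produces two groups, one with two sources and one with a single source.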
