Introducing Julius: Open Source LLM Service Fingerprinting

LLM infrastructure is sprawling across corporate networks faster than security teams can track it. Developers spin up Ollama instances for local testing. Teams deploy LiteLLM proxies to unify model access. Open WebUI installations give employees ChatGPT-like interfaces without IT oversight. The result: LLM services are proliferating everywhere, from internal networks to internet-facing endpoints, often without proper security controls.

This creates opportunity for attackers. An unsecured LLM endpoint is a high-value target for running up compute bills, exfiltrating training data, or pivoting deeper into infrastructure. But identifying these services isn’t straightforward. Traditional banner grabbing tells you a port is open. It doesn’t tell you that the service behind it speaks the Ollama API, what models are loaded, or how to interact with it.

Today, we’re releasing Julius, an open-source tool built specifically for LLM service fingerprinting.

What Julius Does

Julius takes a target URL and identifies exactly what LLM service is running behind it. Point it at an endpoint, and it reports the software, enumerates the available models, and shows you how to start interacting with the service immediately.

Julius ships with probes for popular LLM services out of the box:

  • Self-hosted inference: Ollama, llama.cpp, LocalAI, LM Studio, vLLM
  • Inference proxies: LiteLLM, Kong AI Gateway
  • Chat interfaces: Open WebUI, LibreChat, SillyTavern
  • Enterprise platforms: NVIDIA NIM, HuggingFace TGI
  • Generic detection: OpenAI-compatible API endpoints

When Julius identifies a service, it doesn’t stop at the name. For services with model enumeration endpoints, it extracts the list of available models. For several services, it provides ready-to-use configuration showing exactly how to interact with the discovered endpoint.

Design Philosophy

We built Julius to do one thing well: go from an IP and port combination to actionable intelligence about LLM infrastructure.

Every probe is defined in a single YAML file. This keeps the detection logic readable, self-contained, and easy to extend. Security researchers can add support for new services without diving into Go code. The format is simple enough that both humans and LLMs can understand and modify it.
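
To make this concrete, here is a sketch of what a probe definition might look like. The field names and structure below are illustrative assumptions, not Julius's actual schema; consult the probe files in the repository for the real format.

```yaml
# Hypothetical probe sketch -- field names are assumptions, not the real schema.
name: ollama
description: Ollama local inference server
requests:
  - method: GET
    path: /api/tags        # Ollama's model listing endpoint
matchers:
  - type: body
    contains: '"models"'   # signature expected in the response body
specificity: high          # prefer this match over a generic fallback
```

The appeal of a format like this is that adding a new service means writing a handful of declarative lines, not compiling new detection code.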

Julius also caches HTTP responses during a scan, so multiple probes targeting the same endpoint don’t result in duplicate requests. You can write 100 probes that check the root path (/) for different signatures without overloading the target. Julius fetches the page once and evaluates all matching rules against the cached response.
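
The caching idea is simple enough to sketch in a few lines. This is an illustrative Python model of the behavior described above, not Julius's actual Go implementation; all names here are hypothetical.

```python
# Illustrative sketch of per-scan response caching (not Julius's real Go code).
# Each URL is fetched at most once; every probe evaluates its matching rule
# against the cached response body.

class ResponseCache:
    def __init__(self, fetch):
        self.fetch = fetch        # real HTTP fetcher would go here
        self.cache = {}
        self.requests_made = 0    # counter to show dedup in action

    def get(self, url):
        if url not in self.cache:
            self.cache[url] = self.fetch(url)
            self.requests_made += 1
        return self.cache[url]

def run_probes(cache, probes):
    """Return the names of probes whose signature appears in the cached body."""
    return [p["name"] for p in probes if p["signature"] in cache.get(p["url"])]
```

With this structure, a hundred probes against the same root path trigger exactly one outbound request.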

Julius prioritizes precision over breadth. Each probe includes specificity scoring to avoid false positives. An Ollama instance should be identified as Ollama, not just “something OpenAI-compatible.” The generic OpenAI-compatible probe exists as a fallback, but specific service detection always takes precedence.
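
The precedence rule above amounts to picking the highest-specificity match among everything that fired. A minimal sketch of that selection logic, with assumed field names rather than Julius's actual internals:

```python
# Sketch of specificity-based match selection (field names are assumptions).
# Among all probes whose signatures matched, report the most specific one,
# so a generic OpenAI-compatible hit only wins when nothing better fired.

def best_match(matches):
    """matches: list of dicts with a 'service' name and numeric 'specificity'."""
    if not matches:
        return None
    return max(matches, key=lambda m: m["specificity"])["service"]
```

Under this scheme, an endpoint that matches both the generic fallback and the Ollama probe is reported as Ollama.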

Installation & Setup

Julius requires Go 1.21 or later. Installation takes seconds:

go install github.com/praetorian-inc/julius/cmd/julius@latest

Once installed, Julius is ready to use with no additional configuration. For advanced usage scenarios, check the repository’s README for optional configuration flags and output formats.

Usage Examples

Basic service identification:

julius probe https://target.example.com:11434

Scan multiple targets from a file:

julius probe -f targets.txt

JSON output for automation:

julius probe https://target.example.com -o json

Julius integrates naturally into existing security workflows. Use it during reconnaissance to identify LLM services, include it in continuous asset discovery pipelines, or integrate the JSON output into your security orchestration platform.
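
As a sketch of what pipeline integration might look like, the snippet below flattens JSON scan results into rows for an asset inventory or SOAR platform. The field names (target, service, models) are assumptions for illustration; consult the repository for the actual output schema.

```python
import json

# Hypothetical output shape -- see the repo for the real JSON schema.
sample = '''[
  {"target": "https://10.0.0.5:11434", "service": "ollama",
   "models": ["llama3", "mistral"]},
  {"target": "https://10.0.0.9:4000", "service": "litellm", "models": []}
]'''

def summarize(raw):
    """Flatten scan results into (target, service, model_count) rows."""
    return [(r["target"], r["service"], len(r.get("models", [])))
            for r in json.loads(raw)]
```

From there, the rows can be pushed wherever your orchestration platform ingests discovery data.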

What's Next

Julius 1.0 focuses on HTTP-based fingerprinting of known LLM services. We’re already working on expanding its capabilities while maintaining the lightweight, fast execution that makes it practical for large-scale reconnaissance.

On our roadmap: additional probes for cloud-hosted LLM services, smarter detection of custom integrations, and the ability to analyze HTTP traffic patterns to identify LLM usage that doesn’t follow standard API conventions. We’re also exploring how Julius can work alongside AI agents to autonomously discover LLM infrastructure across complex environments.

Contributing & Community

Julius is available now under the Apache 2.0 license at https://github.com/praetorian-inc/julius.

We welcome contributions from the community. Whether you’re adding probes for services we haven’t covered, reporting bugs, or suggesting new features, check the repository’s CONTRIBUTING.md for guidance on probe definitions and development workflow.

Ready to start? Clone the repository, experiment with Julius in your environment, and join the discussion on GitHub. We’re excited to see how the security community uses this tool in real-world reconnaissance workflows. Star the project if you find it useful, and let us know what LLM services you’d like to see supported next.

About the Authors

Evan Leleux

Evan Leleux is a Software Engineer at Praetorian focused on building scalable, distributed systems for enterprise security operations. He loves challenging problems and is always eager to learn. Evan is a Georgia Tech alumnus.

