LLM infrastructure is sprawling across corporate networks faster than security teams can track it. Developers spin up Ollama instances for local testing. Teams deploy LiteLLM proxies to unify model access. Open WebUI installations give employees ChatGPT-like interfaces without IT oversight. The result: LLM services are proliferating everywhere, from internal networks to internet-facing endpoints, often without proper security controls.
This sprawl creates an opportunity for attackers. An unsecured LLM endpoint is a high-value target for running up compute bills, exfiltrating training data, or pivoting deeper into infrastructure. But identifying these services isn’t straightforward. Traditional banner grabbing tells you a port is open. It doesn’t tell you that the service behind it speaks the Ollama API, what models are loaded, or how to interact with it.
Today, we’re releasing Julius, an open-source tool built specifically for LLM service fingerprinting.
What Julius Does
Julius takes a target URL and identifies exactly which LLM service is running behind it. Point it at an endpoint, and Julius names the software, extracts the available models, and shows you how to start interacting with the service immediately.
Julius ships with probes for popular LLM services out of the box:
- Self-hosted inference: Ollama, llama.cpp, LocalAI, LM Studio, vLLM
- Inference proxies: LiteLLM, Kong AI Gateway
- Chat interfaces: Open WebUI, LibreChat, SillyTavern
- Enterprise platforms: NVIDIA NIM, HuggingFace TGI
- Generic detection: OpenAI-compatible API endpoints
When Julius identifies a service, it doesn’t stop at the name. For services with model enumeration endpoints, it extracts the list of available models. For several services, it provides ready-to-use configuration showing exactly how to interact with the discovered endpoint.
Design Philosophy
We built Julius to do one thing well: go from an IP and port combination to actionable intelligence about LLM infrastructure.
Every probe is defined in a single YAML file. This keeps the detection logic readable, self-contained, and easy to extend. Security researchers can add support for new services without diving into Go code. The format is simple enough that both humans and LLMs can understand and modify it.
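To give a flavor of what a YAML-defined probe can express, here is a hypothetical sketch. The field names are illustrative only, not Julius’s actual schema; see the repository for real probe files. (The "Ollama is running" banner is the real response Ollama serves at its root path.)

```yaml
# Illustrative probe sketch -- field names are hypothetical,
# not Julius's actual probe schema.
name: ollama
category: self-hosted-inference
requests:
  - path: /
    match:
      body_contains: "Ollama is running"
models:
  path: /api/tags        # where to enumerate loaded models
```

Keeping detection logic in declarative files like this means adding a new service is a matter of describing its signatures, not writing Go.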
Julius also caches HTTP responses during a scan, so multiple probes targeting the same endpoint don’t result in duplicate requests. You can write 100 probes that check the root path for different signatures without overloading the target. Julius fetches the page once and evaluates all matching rules against the cached response.
Julius prioritizes precision over breadth. Each probe includes specificity scoring to avoid false positives. An Ollama instance should be identified as Ollama, not just “something OpenAI-compatible.” The generic OpenAI-compatible probe exists as a fallback, but specific service detection always takes precedence.
Installation & Setup
Julius requires Go 1.21 or later. Installation takes seconds:
go install github.com/praetorian-inc/julius/cmd/julius@latest
Once installed, Julius is ready to use with no additional configuration. For advanced usage, check the repository’s README for optional configuration flags and output formats.
Usage Examples
Basic service identification:
julius probe https://target.example.com:11434
Scan multiple targets from a file:
julius probe -f targets.txt
JSON output for automation:
julius probe https://target.example.com -o json
Julius integrates naturally into existing security workflows. Use it during reconnaissance to identify LLM services, include it in continuous asset discovery pipelines, or integrate the JSON output into your security orchestration platform.
What’s Next
Julius 1.0 focuses on HTTP-based fingerprinting of known LLM services. We’re already working on expanding its capabilities while maintaining the lightweight, fast execution that makes it practical for large-scale reconnaissance.
On our roadmap: additional probes for cloud-hosted LLM services, smarter detection of custom integrations, and the ability to analyze HTTP traffic patterns to identify LLM usage that doesn’t follow standard API conventions. We’re also exploring how Julius can work alongside AI agents to autonomously discover LLM infrastructure across complex environments.
Contributing & Community
Julius is available now under the Apache 2.0 license at https://github.com/praetorian-inc/julius
We welcome contributions from the community. Whether you’re adding probes for services we haven’t covered, reporting bugs, or suggesting new features, check the repository’s CONTRIBUTING.md for guidance on probe definitions and development workflow.
Ready to start? Clone the repository, experiment with Julius in your environment, and join the discussion on GitHub. We’re excited to see how the security community uses this tool in real-world reconnaissance workflows. Star the project if you find it useful, and let us know what LLM services you’d like to see supported next.
See Praetorian in Action
Request a 30-day free trial of our Managed Continuous Threat Exposure Management solution.
