AWS Bedrock doesn't have a visibility problem. It has a usability problem.
Teams are already using inference profiles to structure their AI usage — assigning profiles by application, workflow, or even individual engineer. It's clean, intentional, and governable. Until you open AWS CUR.
In the Cost and Usage Report, all of that carefully designed structure collapses into a single, complex resourceId string — a long ARN like arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.meta.llama3-1-70b-instruct-v1:0/internal-genai-app-production-v2. Technically the data is there. Practically, it's unusable for filtering, grouping, or any kind of cost analysis.
As Bedrock usage scales, costs get real fast. Teams that started with a shared inference profile quickly evolve to more granular setups — profiles per application, per environment, sometimes per engineer — managed via infrastructure-as-code and distributed as part of runtime configuration.
But the first question these teams ask isn't "which profile?" — it's sharper than that:
Which Bedrock costs went through an inference profile at all?
That distinction becomes the dividing line. Spend that flows through inference profiles is usually owned and expected. Spend that doesn't is a question mark. That's the first filter every FinOps team wants. Everything else — chargebacks, optimization, anomaly detection — comes after.
Teams have tried to approximate this with tags and manual filters. It kind of works — but it's fragile. Tags depend on naming conventions. Filters require manual stitching. Edge cases slip through. And validation still means digging into other tools.
It answers the question, but not in a way you'd trust day to day.
When a request goes through an inference profile, AWS generates a unique ARN in the CUR's resourceId column:
arn:aws:bedrock:[region]:[account-id]:inference-profile/[profile-id]/[application-suffix]
Two pieces matter here: the profile ID, which tells you which inference profile (and therefore which model) served the request, and the application suffix, which ties the spend back to the application, environment, or engineer it was provisioned for.
Every request that uses a profile leaves this trace. Every request that doesn't — doesn't. But as long as that signal stays trapped inside an ARN, it's effectively invisible for cost analysis.
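That first filter can be sketched directly off the trace. A minimal sketch in Python, using toy rows and simplified field names (real CUR exports carry these values in columns such as line_item_resource_id and line_item_unblended_cost):

```python
# Toy CUR extract; field names are simplified stand-ins for the real
# CUR columns (line_item_resource_id, line_item_unblended_cost).
cur_rows = [
    {
        "resource_id": (
            "arn:aws:bedrock:us-east-1:123456789012:inference-profile/"
            "us.meta.llama3-1-70b-instruct-v1:0/internal-genai-app-production-v2"
        ),
        "cost": 12.40,
    },
    {
        "resource_id": (
            "arn:aws:bedrock:us-east-1:123456789012:foundation-model/"
            "anthropic.claude-3-sonnet-20240229-v1:0"
        ),
        "cost": 3.75,
    },
]

def via_inference_profile(resource_id: str) -> bool:
    """The first filter: did this line item flow through an inference profile?"""
    return ":inference-profile/" in resource_id

# Split Bedrock spend into "owned and expected" vs "question mark".
profiled = sum(r["cost"] for r in cur_rows if via_inference_profile(r["resource_id"]))
rogue = sum(r["cost"] for r in cur_rows if not via_inference_profile(r["resource_id"]))
print(f"through a profile: ${profiled:.2f}, bypassing profiles: ${rogue:.2f}")
```

The point is that the signal is a substring check on the ARN, not a tagging convention, so it can't drift the way tags do.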
Even once you can see inference profiles, there's an important distinction: system-defined profiles (the cross-region routing profiles AWS provides) versus application inference profiles (the ones you create and assign yourself). They look similar in raw data. They behave very differently in practice. If you're trying to understand ownership, allocation, or anomalies, that difference matters.
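Pulling the ARN apart makes that distinction, and ownership, explicit. A regex sketch assuming the ARN shape shown earlier (real resourceId values may vary; this handles only the pattern from this article):

```python
import re

# Matches the shape described above:
# arn:aws:bedrock:[region]:[account-id]:inference-profile/[profile-id]/[application-suffix]
ARN_PATTERN = re.compile(
    r"^arn:aws:bedrock:(?P<region>[^:]+):(?P<account>\d+):"
    r"inference-profile/(?P<profile_id>[^/]+)/(?P<app_suffix>.+)$"
)

def parse_profile_arn(resource_id: str):
    """Return the useful pieces of an inference-profile ARN, or None
    for spend that bypassed profiles entirely."""
    m = ARN_PATTERN.match(resource_id)
    return m.groupdict() if m else None

parsed = parse_profile_arn(
    "arn:aws:bedrock:us-east-1:123456789012:inference-profile/"
    "us.meta.llama3-1-70b-instruct-v1:0/internal-genai-app-production-v2"
)
print(parsed)
```

With the profile ID and application suffix as separate fields, grouping and filtering become ordinary column operations instead of string archaeology.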
One high-growth AI engineering team was spending $600/day on Bedrock for tool-assisted coding. By assigning individual inference profiles to each engineer via Terraform, their FinOps team could attribute that spend to individual engineers and weigh it against per-seat licensing alternatives.
Beyond licensing, the same data unlocks internal chargebacks based on actual inference usage, and catches "rogue" spend — any usage that bypasses inference profiles entirely.
We've built this directly into Finout. No custom scripts to parse ARNs. No hunting for untracked spend in CloudWatch logs.
Finout now automatically extracts metadata from the Bedrock resourceId to provide two new dimensions: the inference profile itself and the application suffix attached to it.
Build a Finout dashboard on top of these dimensions and you can filter, group, slice, and validate, without stitching together workarounds or jumping between tools.
AI cost allocation is changing. It's no longer just account, service, and tag. It's about which path a request took, which control layer it passed through, and whether it was even supposed to happen.
Inference profiles are one of those layers. If they stay buried in an ARN, you have the data — but not the insight. And that's the difference between tracking spend and actually managing it.