# Amazon Textract — Phase 1 Enterprise Access Hardening (PrivateLink + Org Guardrails + Endpoint Policy)
**Scope (Phase 1 only):**
This design focuses on the *three foundational controls* required to expose Amazon Textract safely across application teams:
1) **Network boundary:** Interface VPC Endpoint (AWS PrivateLink) + Private DNS
2) **Org-level enforcement:** SCP / Permission Boundary to **deny non-VPCE Textract calls**
3) **Ingress gate:** VPC Endpoint Policy to control **who may use the endpoint**
> Async job orchestration (SNS/SQS, result storage patterns, etc.) is explicitly **out of scope** for Phase 1 and will be addressed in Phase 2.
---
## 1. Why this is needed (and why IAM-only is not enough)
### IAM-only is necessary but not sufficient
Identity-based IAM policies answer: **“who is allowed to call Textract?”**
They do **not** reliably enforce: **“from where can Textract be called?”**
If credentials are abused (e.g., SSRF, role credential theft, CI misuse), IAM-only typically still permits calling Textract from *outside your controlled network* unless you add a network-bound guardrail.
### What Phase 1 adds (defense-in-depth)
This design ensures Textract can only be used:
- **From inside approved VPCs via a specific VPC Endpoint (VPCE)**
- **By approved principals only (endpoint policy allowlist)**
- **With org-level enforcement (SCP/boundary) that teams cannot bypass**
AWS explicitly supports accessing Textract via interface endpoints and Private DNS using the default regional DNS name.
See “Amazon Textract and interface VPC endpoints” (Textract Developer Guide).
---
## 2. Target state (control stack overview)
### Control 1 — Network Boundary (default private access)
- Create an **Interface VPC Endpoint** for Textract:
  - Service name: `com.amazonaws.<region>.textract` (optional FIPS variant: `com.amazonaws.<region>.textract-fips`)
- Enable **Private DNS**
- Apps keep using the standard endpoint DNS name (no code changes), but DNS resolves to the VPCE private IPs
AWS Textract docs confirm:
- Textract supports interface endpoints (PrivateLink)
- Private DNS allows using the default DNS name (e.g., `textract.us-east-1.amazonaws.com`).
### Control 2 — Org-Level Enforcement (hard deny for non-VPCE calls)
- Apply **SCP** (preferred) or **Permission Boundary**
- Deny `textract:*` unless request context contains the expected `aws:SourceVpce`
SCPs define “permission guardrails” across accounts and don’t grant permissions themselves.
### Control 3 — Endpoint Policy Gate (who can use the endpoint)
- Attach a **VPC Endpoint Policy** to the Textract interface endpoint
- Allowlist approved IAM roles/principals
- Endpoint policy doesn’t replace IAM; both must allow the request
An endpoint policy is a resource-based policy attached to a VPC endpoint; see “Control access to VPC endpoints using endpoint policies” (Amazon VPC User Guide / PrivateLink).
---
## 3. Detailed design
### 3.1 Network boundary — Interface VPCE + Private DNS
#### Design decisions
- **Interface VPCE (PrivateLink)** is mandatory for production VPCs that use Textract.
- **Private DNS enabled** is mandatory to avoid app-side endpoint overrides and keep SDK usage standard.
#### What this achieves
- Workloads in private subnets can reach Textract without requiring IGW/NAT/public IPs (removes broad egress dependency).
- Textract calls traverse AWS private networking through the VPCE entry point.
The Textract service name and Private DNS behavior are documented in “Amazon Textract and interface VPC endpoints” (Textract Developer Guide).
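
One way to confirm the Private DNS behavior from a workload is to resolve the regional Textract hostname and check that every returned address is private. The sketch below is only an illustration under stated assumptions: the helper names `all_private` and `resolve_ips` are hypothetical, the hostname is the `us-east-1` example used in this document, and the check relies on Python's general notion of a private address rather than on your actual VPC CIDR ranges.

```python
import ipaddress
import socket


def all_private(addresses):
    """Return True only if the list is non-empty and every address is private."""
    parsed = [ipaddress.ip_address(a) for a in addresses]
    return bool(parsed) and all(ip.is_private for ip in parsed)


def resolve_ips(hostname, port=443):
    """Resolve a hostname to the sorted set of IP addresses it maps to."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


# Usage from a workload inside the VPC (performs a real DNS lookup):
#   ips = resolve_ips("textract.us-east-1.amazonaws.com")
#   print(ips, all_private(ips))
```

A stricter variant could compare the resolved addresses against the endpoint subnets' CIDR blocks instead of using `is_private`, which also accepts loopback and link-local ranges.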
#### Required infra artifacts (Terraform-managed)
- `aws_vpc_endpoint` (`vpc_endpoint_type = "Interface"`), configured with:
  - Subnets: dedicated endpoint subnets (or shared with app subnets)
  - Security group: allow inbound HTTPS (TCP 443) only from app workloads and limit egress as needed
  - `private_dns_enabled = true`
  - Endpoint policy (see §3.3)
> Note: VPCE is the *access path*, not the *security boundary alone*. Enforcement comes from SCP/boundary + endpoint policy.
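
The design manages these artifacts with Terraform. Purely to illustrate how they fit together, the sketch below builds the equivalent request parameters for the EC2 `CreateVpcEndpoint` API as a plain Python dictionary (the parameter shape that boto3's `create_vpc_endpoint` accepts). The function name and all IDs are hypothetical placeholders, the policy is an empty stand-in for the one described in §3.3, and nothing here calls AWS.

```python
import json


def textract_endpoint_params(region, vpc_id, subnet_ids, sg_ids, policy, fips=False):
    """Build CreateVpcEndpoint parameters for a Textract interface endpoint."""
    service = "textract-fips" if fips else "textract"
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.{service}",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(sg_ids),
        "PrivateDnsEnabled": True,  # keep the default regional DNS name working
        "PolicyDocument": json.dumps(policy),  # endpoint policy as a JSON string
    }


# Placeholder IDs for illustration only; the empty policy is a stand-in.
params = textract_endpoint_params(
    region="us-east-1",
    vpc_id="vpc-0123456789abcdef0",
    subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"],
    sg_ids=["sg-cccc3333"],
    policy={"Version": "2012-10-17", "Statement": []},
)
print(params["ServiceName"])  # com.amazonaws.us-east-1.textract
```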
---
### 3.2 Org-level enforcement — SCP / Permission Boundary (deny non-VPCE)
#### Recommendation: SCP over boundary
- **SCP** is the strongest control: centrally enforced at OU/account level, app teams can’t loosen it.
- Permission boundaries can be used in environments without Organizations or for special cases, but SCP is preferred for enterprise guardrails.
For an overview, see “Service control policies (SCPs)” (AWS Organizations User Guide). For examples and syntax guidance, see “SCP examples” and “SCP syntax” in the same guide.
#### Enforcement pattern
**Deny** Textract calls unless they come from the approved VPCE ID.
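`aws:SourceVpce` is a global condition context key that identifies the VPC endpoint a request passed through. The key is present in the request context only when the request arrives via a VPC endpoint. Because a negated operator such as `StringNotEquals` evaluates to true when the key is absent, the deny below also covers requests that bypass VPC endpoints entirely. See “AWS global condition context keys” (IAM User Guide).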
##### SCP example (Phase 1 baseline)
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyTextractUnlessFromApprovedVPCE",
      "Effect": "Deny",
      "Action": "textract:*",
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:SourceVpce": "vpce-xxxxxxxxxxxxxxxxx"
        }
      }
    }
  ]
}
```
Any Textract call outside the approved VPC/VPCE path will fail with an explicit deny. This includes:
- local developer machines
- CI runners outside the controlled VPC
- workloads not configured to use the VPC endpoint path

This is intentional: it prevents credential abuse from uncontrolled networks.
Optional enhancement (future): add break-glass exceptions or tighter source bindings (e.g., VPC-bound credential usage patterns). The AWS Security Blog post “Restrict where EC2 instance credentials can be used from” discusses advanced patterns for restricting where credentials can be used from.
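
An organization often ends up with more than one approved endpoint (for example, one per VPC or Region), in which case the single-ID baseline has to be rendered per environment. As a rough sketch under that assumption, the helper below (hypothetical name `build_textract_scp`, placeholder endpoint IDs) produces the same deny statement with a list of endpoint IDs. With `StringNotEquals` and several values, the deny applies only when the request's `aws:SourceVpce` matches none of them.

```python
import json


def build_textract_scp(vpce_ids):
    """Render the deny-unless-approved-VPCE SCP for one or more endpoint IDs."""
    ids = sorted(set(vpce_ids))
    if not ids:
        raise ValueError("at least one approved VPCE ID is required")
    if not all(v.startswith("vpce-") for v in ids):
        raise ValueError("endpoint IDs are expected to start with 'vpce-'")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyTextractUnlessFromApprovedVPCE",
                "Effect": "Deny",
                "Action": "textract:*",
                "Resource": "*",
                "Condition": {"StringNotEquals": {"aws:SourceVpce": ids}},
            }
        ],
    }


# Placeholder IDs; real values would come from the endpoint module outputs.
scp = build_textract_scp(["vpce-0bbb444455556666b", "vpce-0aaa111122223333a"])
print(json.dumps(scp, indent=2))
```

Generating the policy from the endpoint outputs, rather than editing the JSON by hand, keeps the SCP and the deployed endpoints from drifting apart.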
---
### 3.3 Endpoint policy gate (who may use the endpoint)
Even with a VPCE in place, without endpoint policy hardening you can unintentionally allow broad usage inside the VPC. Endpoint policies provide a second gate:
- SCP/boundary: “Requests must arrive via this VPCE”
- Endpoint policy: “Only these principals may use this VPCE”
- IAM: “These principals may call Textract actions”

Endpoint policy definition and behavior: see “Control access to VPC endpoints using endpoint policies” (Amazon VPC User Guide / PrivateLink).
##### Endpoint policy example (Phase 1 baseline)
Use `aws:PrincipalArn` allowlisting (or account/OU patterns) to limit who can use this endpoint:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowApprovedRolesToUseTextractEndpoint",
      "Effect": "Allow",
      "Principal": "*",
      "Action": [
        "textract:DetectDocumentText",
        "textract:AnalyzeDocument",
        "textract:StartDocumentTextDetection",
        "textract:StartDocumentAnalysis",
        "textract:GetDocumentTextDetection",
        "textract:GetDocumentAnalysis"
      ],
      "Resource": "*",
      "Condition": {
        "ArnLike": {
          "aws:PrincipalArn": [
            "arn:aws:iam::*:role/app-*",
            "arn:aws:iam::*:role/platform-textract-*"
          ]
        }
      }
    }
  ]
}
```
Notes:
- Keep the endpoint policy aligned with your approved usage model (sync vs async can still be allowed in Phase 1; orchestration comes later).
- The endpoint policy does not override IAM; both must allow the request. See “Control access to VPC endpoints using endpoint policies” (Amazon VPC User Guide / PrivateLink).
- The example ARNs use `*` for the account ID, so they match same-named roles in any account; narrow this to specific account IDs or add an `aws:PrincipalOrgID` condition if that is broader than intended.
- You can attach an endpoint policy only when the service supports it (Textract does; the AWS ML Blog post “Using Amazon Textract with AWS PrivateLink” demonstrates this specifically).
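
The three gates in this section act as a conjunction: a call succeeds only if the SCP does not deny it, the endpoint policy allows the principal, and IAM allows the action. The sketch below is a deliberately simplified model of that outcome, not AWS's actual evaluation engine: the names `Request`, `arn_like`, and `evaluate` are hypothetical, the endpoint ID is a placeholder, the `ArnLike` matcher is a rough approximation, the endpoint policy's action list is ignored, and the order of checks is chosen for readability only.

```python
import re
from dataclasses import dataclass
from typing import Optional

APPROVED_VPCE = "vpce-0aaa111122223333a"  # placeholder endpoint ID
ALLOWED_ROLE_PATTERNS = [
    "arn:aws:iam::*:role/app-*",
    "arn:aws:iam::*:role/platform-textract-*",
]


@dataclass
class Request:
    principal_arn: str
    source_vpce: Optional[str]  # None when the call did not use a VPC endpoint
    iam_allows: bool  # outcome of the identity-based policy, assumed known


def arn_like(pattern, arn):
    """Approximate ArnLike: match the six colon-delimited parts with * and ?."""
    p_parts, a_parts = pattern.split(":", 5), arn.split(":", 5)
    if len(p_parts) != 6 or len(a_parts) != 6:
        return False
    for p, a in zip(p_parts, a_parts):
        regex = "".join(
            ".*" if c == "*" else "." if c == "?" else re.escape(c) for c in p
        )
        if re.fullmatch(regex, a) is None:
            return False
    return True


def evaluate(req):
    """Return (allowed, reason) for a Textract call under the three gates."""
    if req.source_vpce != APPROVED_VPCE:
        return False, "explicit deny: SCP (not via approved VPCE)"
    if not any(arn_like(p, req.principal_arn) for p in ALLOWED_ROLE_PATTERNS):
        return False, "denied: endpoint policy (principal not allowlisted)"
    if not req.iam_allows:
        return False, "denied: no IAM allow"
    return True, "allowed"


app_role = "arn:aws:iam::111122223333:role/app-invoices"
other_role = "arn:aws:iam::111122223333:role/data-science"
for req in [
    Request(app_role, APPROVED_VPCE, True),
    Request(app_role, None, True),
    Request(other_role, APPROVED_VPCE, True),
    Request(app_role, APPROVED_VPCE, False),
]:
    print(req.principal_arn, req.source_vpce, evaluate(req))
```

The four sample requests correspond to one allowed call and one failure per gate, which is the minimum set of cases worth exercising when the controls are rolled out.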
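---
## 4. Deliverables
### 4.1 Terraform module: Textract VPCE
**Creates**
- Interface VPCE for Textract (`com.amazonaws.<region>.textract`)
- Private DNS enabled
- VPCE Security Group
- VPCE Endpoint Policy

**Outputs**
- `textract_vpce_id`
- `textract_vpce_sg_id`
- `textract_vpce_dns_entries`

Textract interface endpoint requirements: see “Amazon Textract and interface VPC endpoints” (Textract Developer Guide). General interface endpoint creation and endpoint policy support: see “Access an AWS service using an interface VPC endpoint” (PrivateLink).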
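### 4.2 Org guardrail package (SCP)
**Delivers**
- SCP JSON policy templates (deny non-VPCE Textract)
- Rollout plan and testing guidance

SCPs are guardrails; deployment should be staged and tested. See “Service control policies (SCPs)” (AWS Organizations User Guide).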
---
## 5. Responsibilities
**Platform team**
- VPCE creation, Private DNS enablement
- Endpoint policy governance
- SCP/boundary enforcement policy ownership
- Reference IAM policy sets (least privilege patterns)

**Application teams**
- Use standard AWS SDK calls (no endpoint overrides)
- Run workloads in approved VPCs/subnets
- Request approved IAM role patterns (matching allowlist rules)
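---
## 6. Validation
### 6.1 Positive test: call via the approved VPCE
- Run a Textract call from a workload inside an approved VPC
- Confirm success without requiring public egress or a NAT dependency
- Optional: validate that DNS resolves to the VPCE private IPs

Private DNS behavior is described in “Amazon Textract and interface VPC endpoints” (Textract Developer Guide).
### 6.2 Negative test: non-VPCE path
- Attempt a Textract call from a non-approved network path
- Expect an access-denied error caused by the explicit deny in the SCP/boundary

SCP evaluation model and explicit-deny behavior: see “SCP evaluation” (AWS Organizations User Guide).
### 6.3 Negative test: role not in the endpoint allowlist
- Use a role that is NOT in the allowlist, from inside the VPC
- Expect denial even though the call goes via the VPCE (the endpoint policy blocks it)

Endpoint policy purpose and constraints: see “Control access to VPC endpoints using endpoint policies” (Amazon VPC User Guide / PrivateLink).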
---
## 7. Risks and limitations
- PrivateLink ensures a private access path, but the service remains a managed AWS service (data is processed by AWS). The security objective is controlling network path + enforcement, not “keeping data inside the VPC boundary”.
- If some workloads must call Textract from outside the VPC (e.g., a developer laptop), they will be blocked by design. Handle this via separate non-prod policies or dedicated dev accounts/OUs.
---
## 8. Phase 2 preview (out of scope for Phase 1)
- Standard async orchestration: SNS/SQS patterns, DLQ, retries
- Output storage patterns, KMS, lifecycle policies
- Optional “Textract Gateway Service” for rate limiting, auditing enrichment, payload sanitization
---
## References
Accessed: 2026-01-19
- Amazon Textract and interface VPC endpoints (Textract Developer Guide): https://docs.aws.amazon.com/textract/latest/dg/vpc-interface-endpoints.html
- Control access to VPC endpoints using endpoint policies (Amazon VPC User Guide / PrivateLink): https://docs.aws.amazon.com/vpc/latest/privatelink/vpc-endpoints-access.html
- Access an AWS service using an interface VPC endpoint (PrivateLink): https://docs.aws.amazon.com/vpc/latest/privatelink/create-interface-endpoint.html
- Service control policies (SCPs) - AWS Organizations: https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_policies_scps.html
- SCP examples - AWS Organizations: https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_policies_scps_examples.html
- SCP syntax - AWS Organizations: https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_policies_scps_syntax.html
- SCP evaluation - AWS Organizations: https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_policies_scps_evaluation.html
- AWS global condition context keys (IAM): https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_condition-keys.html
- AWS Security Blog: Restrict where EC2 instance credentials can be used from (network-bound enforcement patterns): https://aws.amazon.com/blogs/security/how-to-use-policies-to-restrict-where-ec2-instance-credentials-can-be-used-from/
- AWS ML Blog: Using Amazon Textract with AWS PrivateLink (Textract + PrivateLink + endpoint policy): https://aws.amazon.com/blogs/machine-learning/using-amazon-textract-with-aws-privatelink/