Filter Operators

SQL-like operators for metadata filters in bucket search — eq, ne, like, prefix, in, gt/gte/lt/lte, exists.

Bucket search accepts a filter object that narrows results by metadata. Plain scalar values are exact-match (backward compatible). Wrap a value in an operator dict to express richer predicates.

Operators

Operator	Type	Example	Notes
`eq`	any	`{"eq": "urgent"}`	Exact match. Lowered to engine fast path.
`ne`	any	`{"ne": "draft"}`	Not-equal. Post-filter.
`like`	string	`{"like": "%legal%"}`	SQL LIKE: `%` matches any run of characters, `_` matches a single character, and `\%` / `\_` match the literal characters. Case-insensitive.
`prefix`	string	`{"prefix": "2024-"}`	Starts-with. Case-insensitive.
`in`	array	`{"in": ["a","b","c"]}`	Membership. Max 100 entries.
`gt` / `gte`	number / ISO date	`{"gte": 0.8}`	Greater-than (or equal). Values are coerced to numbers where possible, with a fallback to ISO date strings.
`lt` / `lte`	number / ISO date	`{"lt": 100}`	Less-than (or equal).
`exists`	boolean	`{"exists": true}`	`true` requires the field to be present and non-empty; `false` requires it to be missing.
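
To make the table concrete, here is a minimal client-side sketch of how a single clause could be evaluated against a document's metadata. It is not the engine's implementation. The names `match_clause`, `_apply`, and `_comparable` are hypothetical, `like` is left out, and several behaviors are assumptions: operators other than `exists` fail on a missing field, values that cannot be compared do not match, and multiple operators in one dict are ANDed.

```python
import operator
from datetime import datetime

_MISSING = object()
_ORDERING = {"gt": operator.gt, "gte": operator.ge,
             "lt": operator.lt, "lte": operator.le}


def _comparable(value):
    """Coerce to a number if possible, otherwise parse an ISO date string."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return datetime.fromisoformat(str(value))


def _apply(op, value, operand):
    """Evaluate one operator against one metadata value."""
    if op == "exists":
        present = value is not _MISSING and value not in (None, "", [], {})
        return present if operand else value is _MISSING
    if value is _MISSING:
        return False  # assumption: every other operator fails on a missing field
    if op == "eq":
        return value == operand
    if op == "ne":
        return value != operand
    if op == "prefix":
        return str(value).lower().startswith(str(operand).lower())
    if op == "in":
        return value in operand
    if op in _ORDERING:
        try:
            return _ORDERING[op](_comparable(value), _comparable(operand))
        except (TypeError, ValueError):
            return False  # assumption: values that cannot be compared do not match
    raise ValueError(f"unsupported operator: {op}")


def match_clause(metadata, key, clause):
    """Return True if metadata[key] satisfies a scalar or operator-dict clause."""
    if not isinstance(clause, dict):
        clause = {"eq": clause}  # plain scalars are exact-match
    value = metadata.get(key, _MISSING)
    # Assumption: several operators in one dict must all hold.
    return all(_apply(op, value, operand) for op, operand in clause.items())
```

Read literally, the `exists` row means a field that is present but empty satisfies neither `{"exists": true}` nor `{"exists": false}`, and the sketch follows that reading.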

Example

```bash
curl -X POST https://api.schift.io/v1/buckets/{bucket_id}/search \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "fire safety inspection cycle",
    "top_k": 10,
    "min_score": 0.6,
    "filter": {
      "tag": "urgent",
      "source_url": {"like": "%fire-safety%"},
      "filename":   {"prefix": "2024-"},
      "doc_type":   {"in": ["policy", "spec"]},
      "score":      {"gte": 0.8},
      "stage":      {"ne": "draft"},
      "author":     {"exists": true}
    }
  }'
```
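
The same request can be assembled from Python. The sketch below uses only the standard library and builds the request without sending it. The helper name `build_search_request` and the bucket id `my-bucket` are hypothetical; the URL, headers, and body fields mirror the curl example above.

```python
import json
import os
import urllib.request

API_BASE = "https://api.schift.io/v1"


def build_search_request(bucket_id, query, metadata_filter, top_k=10, min_score=None):
    """Build (but do not send) a bucket search request."""
    body = {"query": query, "top_k": top_k, "filter": metadata_filter}
    if min_score is not None:
        body["min_score"] = min_score
    return urllib.request.Request(
        url=f"{API_BASE}/buckets/{bucket_id}/search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('SCHIFT_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_search_request(
    "my-bucket",  # hypothetical bucket id
    "fire safety inspection cycle",
    {
        "tag": "urgent",                        # plain scalar: exact match
        "doc_type": {"in": ["policy", "spec"]},
        "stage": {"ne": "draft"},
        "author": {"exists": True},
    },
    min_score=0.6,
)
# To send it: urllib.request.urlopen(request)
```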

Filters are conjunctive (AND)

Top-level clauses must all match. Use `in` for OR within a single key, or `$or` (below) for cross-key OR.

Cross-key OR (`$or`)

Use `$or` for disjunctive predicates across different metadata keys. Each arm is a full sub-filter dict and supports any operator.

```json
{
  "filter": {
    "doc_type": "policy",
    "$or": [
      {"severity": "high"},
      {"priority": {"in": ["P0", "P1"]}}
    ]
  }
}
```

The above matches docs where `doc_type = policy` AND (`severity = high` OR `priority` ∈ {P0, P1}). Nesting is allowed up to 3 levels deep, with at most 16 arms per `$or`.
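
The AND/OR semantics and the two limits can be sketched as a small recursive evaluator. This is an illustration only, not the engine's code. The function name `matches` is hypothetical, only plain scalars and `in` are handled to keep it short, and the way nesting depth is counted (each `$or` adds one level) is an assumption.

```python
MAX_OR_DEPTH = 3   # documented nesting limit
MAX_OR_ARMS = 16   # documented arm limit per $or


def matches(metadata, flt, depth=0):
    """Return True if metadata satisfies every clause in flt (AND)."""
    for key, clause in flt.items():
        if key == "$or":
            # Assumption: each $or counts as one nesting level.
            if depth + 1 > MAX_OR_DEPTH:
                raise ValueError("$or nested deeper than 3 levels")
            if len(clause) > MAX_OR_ARMS:
                raise ValueError("more than 16 arms in $or")
            # At least one arm (itself a full sub-filter) must match.
            if not any(matches(metadata, arm, depth + 1) for arm in clause):
                return False
        elif isinstance(clause, dict) and "in" in clause:
            if metadata.get(key) not in clause["in"]:
                return False
        elif isinstance(clause, dict):
            raise NotImplementedError("only scalars and 'in' in this sketch")
        elif metadata.get(key) != clause:
            return False
    return True


example_filter = {
    "doc_type": "policy",
    "$or": [
        {"severity": "high"},
        {"priority": {"in": ["P0", "P1"]}},
    ],
}
```

With `example_filter`, a policy document with `priority` set to `P1` matches even if its `severity` is `low`, while a `spec` document with `severity` set to `high` does not.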

Result ranking

Filters never modify scores; they only prune the candidate set. When post-filter operators are present, the engine fetches `top_k * 3` candidates to absorb the shrinkage, then trims to `top_k` after filtering.
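
The over-fetch step can be pictured roughly as follows. This is a sketch under assumptions, not the engine's code: `post_filter_search` is a hypothetical name, hits are modeled as dicts with `id`, `score`, and `stage` fields, and `ranked_hits` is assumed to be sorted by descending score already.

```python
def post_filter_search(ranked_hits, predicate, top_k, overfetch=3):
    """Over-fetch, apply the post-filter, then trim back to top_k."""
    candidates = ranked_hits[: top_k * overfetch]    # fetch top_k * 3 candidates
    kept = [hit for hit in candidates if predicate(hit)]  # scores are untouched
    return kept[:top_k]                               # trim after filtering


# Ten hypothetical hits in descending score order; odd ids are drafts.
hits = [
    {"id": i, "score": round(1.0 - 0.05 * i, 2),
     "stage": "draft" if i % 2 else "final"}
    for i in range(10)
]

top = post_filter_search(hits, lambda h: h["stage"] != "draft", top_k=2)
```

One consequence of this scheme is that a very selective post-filter can still return fewer than `top_k` hits: if more than two thirds of the fetched candidates are pruned, there is nothing left to fill the gap.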

min_score

A top-level `min_score` (0.0–1.0) drops hits whose final score is below the threshold, applied after rerank. Useful as a hallucination defense in chatbot pipelines.
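
As an illustration of the chatbot use case, the sketch below applies the threshold to a list of reranked hits and returns no context at all when nothing clears it, so the caller can decline to answer rather than guess. The helper names and the `text` / `score` hit fields are hypothetical; the actual response shape is not described here.

```python
def apply_min_score(reranked_hits, min_score):
    """Keep only hits whose final (post-rerank) score meets the threshold."""
    if not 0.0 <= min_score <= 1.0:
        raise ValueError("min_score must be between 0.0 and 1.0")
    return [hit for hit in reranked_hits if hit["score"] >= min_score]


def build_context(reranked_hits, min_score=0.6):
    """Return prompt context, or None when no hit is confident enough."""
    kept = apply_min_score(reranked_hits, min_score)
    if not kept:
        return None  # caller should answer "no relevant source found"
    return "\n\n".join(hit["text"] for hit in kept)


sample_hits = [
    {"text": "Inspections run every 12 months.", "score": 0.82},
    {"text": "Unrelated cafeteria menu.", "score": 0.41},
]
```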

Safety

Patterns are length-capped (256 chars), the wildcard count is capped (16), and `in` lists are capped (100 entries). Patterns are translated to an anchored, case-insensitive regex with all metacharacters escaped, so there is no SQL injection surface.
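
A translation along these lines might look like the sketch below, which uses Python's `re` module. It is not the service's implementation. The function name `like_to_regex` is hypothetical, and counting both `%` and `_` toward the wildcard cap is an assumption.

```python
import re

MAX_PATTERN_LEN = 256   # documented length cap
MAX_WILDCARDS = 16      # documented wildcard cap


def like_to_regex(pattern):
    """Translate a LIKE pattern into an anchored, case-insensitive regex."""
    if len(pattern) > MAX_PATTERN_LEN:
        raise ValueError("pattern longer than 256 chars")
    parts, wildcards, i = [], 0, 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern) and pattern[i + 1] in "%_":
            parts.append(re.escape(pattern[i + 1]))  # \% or \_ is a literal
            i += 2
            continue
        if ch == "%":
            parts.append(".*")   # any run of characters
            wildcards += 1
        elif ch == "_":
            parts.append(".")    # exactly one character
            wildcards += 1
        else:
            parts.append(re.escape(ch))  # escape regex metacharacters
        i += 1
    if wildcards > MAX_WILDCARDS:
        raise ValueError("more than 16 wildcards")
    # \A and \Z anchor the match to the whole string.
    return re.compile(r"\A" + "".join(parts) + r"\Z", re.IGNORECASE | re.DOTALL)
```

Because every non-wildcard character passes through `re.escape`, a user-supplied `.` or `(` is matched literally and cannot change the structure of the regex.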