← All Reports

Sense ATS Staging Regression Report

Sprint 12 — Full + Deep Regression + ATS-653 isDuplicated Validation

📅 2026-05-13 🌐 ats.staging.sensewp.com 🤖 Claude Opus 4.6 + Playwright MCP 👤 Paul Wagner
PASS
Overall Result
30
Test Areas Covered
4/4
ATS-653 Cases Passed
0
Blocking Bugs
87
Screenshots Captured
$81.77
Total API Cost

Execution Metrics & API Cost

MetricValue
Total Elapsed Time~150 minutes across 3 sessions
Session 1 — Initial Regression~60 min — screenshots regression_01–48 (48 shots)
Session 2 — Deep Regression (pipeline, docs, clone, edit, settings)~50 min — screenshots deep_01–27 (27 shots)
Session 3 — Deep Regression (AMI, interview, boards, report)~40 min — screenshots deep_28–39 (12 shots)
Total Screenshots87 (48 initial + 39 deep)
Total Tool Calls~400+ (Playwright actions, snapshots, evaluations, screenshots, GraphQL mutations)
Context Compaction Events3 (one per session at ~200K token context window)
ModelsClaude Opus 4.6 (claude-opus-4-6) + Haiku 4.5 (background tasks)

Actual Token Usage & Cost (measured via ccusage session)

Measured from Claude Code local conversation logs using npx ccusage@latest session --breakdown with LiteLLM pricing (2,707 models). Filtered to the "ats-map" session only — excludes other work done on the same day.

CategoryTokens% of TotalNotes
Input — uncached14,7490.01%Negligible — nearly everything served from cache
Output (incl. extended thinking)350,4940.29%Text, tool calls, chain-of-thought reasoning
Cache write2,223,6001.84%New content entering cache each turn + compaction resets
Cache read118,062,00097.86%Repeated context re-sent across all turns at discounted rate
Total120,651,000100%$81.77 USD

Cost Breakdown by Rate

CategoryTokensRate (Opus)Cost
Cache read
97.9% of all tokens — context re-sent each turn
118,062,000$1.875/1M$22.14
Cache write
New content per turn + compaction cache resets
2,223,600$18.75/1M$41.69
Output (incl. extended thinking)
Text, tool calls, chain-of-thought reasoning
350,494$75.00/1M$26.29
Input uncached
Cache hit rate >99.99%
14,749$15.00/1M$0.22
Total (ats-map session)120,651,000$81.77*

* Slight difference between sum of categories and reported total due to ccusage applying blended rates across Opus and background Haiku usage.

⚠️ Key Observations on Cost

  • Cache reads are 97.9% of all tokens — Claude Code aggressively caches the system prompt and conversation history. Across hundreds of API turns, the same context is re-sent but billed at the deeply discounted cache-read rate ($1.875/1M for Opus), making this by far the cheapest token category despite being the largest by volume.
  • Cache writes are the biggest cost driver ($41.69 / 51% of total cost) despite being only 1.84% of tokens — at $18.75/1M, each turn's new content (assistant output + tool results) is written to cache at a premium. Compaction events invalidate the cache, forcing large re-writes.
  • Output is the second biggest cost ($26.29 / 32% of total cost) at $75/1M for Opus — this includes extended thinking tokens (chain-of-thought reasoning). Only 350K tokens but expensive per-token.
  • Uncached input is nearly zero (14,749 tokens / $0.22) — tool results from previous turns become cached context on subsequent turns, so almost nothing is uncached.
  • Session-level measurement — using ccusage session --breakdown we isolated the "ats-map" project session ($81.77) from other work done on the same day (total day: $162.99 across 4 sessions).

Source: npx ccusage@latest session --since 20260513 --until 20260513 --breakdown with LiteLLM pricing (2,707 models). Model: 100% Claude Opus 4.6 for this session. If using a Claude Max subscription, API metering does not apply.

Cost Comparison: Alternative Approaches

How does the current Claude + Playwright MCP approach compare to alternatives? Four scenarios evaluated.

Current Approach

Claude Opus 4.6 + Playwright MCP

$81.77/full run
120.7M tokens (97.9% cache reads) • ~400 tool calls • 3 sessions ~150 min execution No additional infra cost
✓ Adaptive reasoning, catches unexpected issues
✓ Minimal upfront engineering
✗ Higher per-run cost than coded E2E
Variant A

Figma MCP — Live Design Reads

~$130–180/run
~180–220M tokens • ~500+ tool calls ~200–250 min execution +$25/mo Figma Dev seat
✓ Pixel-perfect design comparison each run
✗ ~1.5–2x cost increase over current ($82)
✗ Figma API responses add 1–3M uncached tokens at $15/1M
✗ More turns = more output tokens (the expensive category)
✗ Figma API rate limits may cause failures
Variant C

Playwright E2E (Claude-Generated)

~$0.01–0.05/run
0 tokens • 0 API calls (per run) ~5–10 min execution One-time scaffold: ~$5–6 API cost (~210K tokens) 109 tests • 47 files • full POM architecture + manual flakiness tuning: ~4–8 hrs
✓ Near-zero marginal cost, fast, deterministic
✓ Framework generated by Claude in one session
✗ Rigid — no adaptive reasoning
✗ Maintenance burden when UI changes
DimensionCurrentA: Figma MCPB: Cached MapC: Coded E2E
Per-Run Cost (full deep)$81.77$120–165$55–70$0.01–0.05
Tokens / Run120.7M~160–200M~85–100M0
Execution Time~150 min~200–250 min~110–130 min~5–10 min
Upfront Engineering~2 hrs~6 hrsDone~$6 + 4–8 hrs tuning
Design AwarenessNoneFull (live)Cached snapshotsNone
Adaptive ReasoningYesYesYesNo
Maintenance on UI ChangeLowLowMedium (regen cache)High (rewrite selectors)

Recommendation

Variant B (Cached UI/UX Context Map) is ready to deploy immediately. The UI/UX context map is already built, so the projected saving of ~$10–25 vs the measured $81.77 starts from the next regression — with zero remaining upfront investment. The saving comes primarily from fewer Playwright exploration turns, which reduces cache writes (51% of cost) and output tokens (32% of cost).

Variant C is now far more accessible than traditional E2E: Claude generated 109 tests across 47 files (full POM, GraphQL client, PrimeNG helpers, fixtures, CI pipeline) for ~$5–6 in a single session — only manual flakiness tuning remains (~4–8 hrs). A hybrid strategy — coded E2E for stable critical paths + Claude MCP with cached design context for exploratory and newly changed features — would minimize both cost and risk.

Note: Only the "Current Approach" cost ($90.17) is measured from actual usage via ccusage. Variants A, B, and C are projections based on the observed cost structure (97.8% cache reads, output tokens as the primary cost driver). Actual costs may vary.

ATS-653: isDuplicated Flag Test Results

Test CaseDescriptionResult
TC1Create application with duplicate email — no error, both flaggedPASS
TC2Update email to duplicate — flag appears; change to unique — flag clearsPASS
TC3Duplicate application into same vacancy — permitted, both flaggedPASS
TC4Move duplicated application to different vacancy — flag clears on both sidesPASS

Acceptance Criteria Verification

TestResult
Creating/updating application with duplicate email no longer returns errorPASS
isDuplicated flag set to true when duplicate email exists in same vacancyPASS
isDuplicated flag set to false when no duplicate email in vacancyPASS
Flag updates correctly after create, update, duplicate, and vacancy movePASS
Duplicating application into its own vacancy is allowedPASS

TC1 — Application Creation with Duplicate Email

Created "TC1 Duplicate Email Test" in C# Dev vacancy with email jane.doe.regression.test.20260513@example.com (same as existing "Jane Doe Regression Test"). No error occurred. Both applications display the duplicate indicator.

TC1: Both applications flagged as duplicates
TC1: Both applications flagged as duplicates

TC2 — Update Recalculates Flag

Changed email to unique value — flags cleared. Changed back to duplicate — backend confirmed isDuplicated: true but icons required page reload (Angular change-detection timing).

TC2: Flags cleared after unique email
TC2: Flags cleared after unique email
TC2: API correct but icons missing (pre-reload)
TC2: API correct but icons missing (pre-reload)
TC2: Flags reappeared after reload
TC2: Flags reappeared after reload

TC3 — Duplicate into Same Vacancy

Duplicated Marcus J. Heller into same vacancy. Count increased 9→10. Both rows flagged; non-duplicates correctly unflagged.

TC3: Duplicate into same vacancy
TC3: Duplicate into same vacancy

TC4 — Vacancy Move Clears Flag

Moved one Marcus J. Heller to different vacancy. Source: flag cleared. Destination: no flag (unique email there).

TC4: Source — flag cleared
TC4: Source — flag cleared
TC4: Destination — no flag
TC4: Destination — no flag

Full Regression Test Results

1. Authentication

TestResult
Login via OpenID Connect (email + password)PASS
Redirect to staging app after authPASS
#01 Landing page
#01 Landing page

2. Vacancy List Page

TestResult
Vacancy list loads with all vacanciesPASS
Status tabs (Approved, Pending approval, Rejected)PASS
Vacancy cards show: status, visibility, name, applicant count, datePASS
Search/filter functionalityPASS
Mixed statuses visible: Open, Closed, Draft, On holdPASS
#02 Filter: Open
#02 Filter: Open
#03 Filter: Closed
#03 Filter: Closed
#04 Filter: Draft
#04 Filter: Draft
#05 Filter: On hold
#05 Filter: On hold
#06 Tab: Pending
#06 Tab: Pending
#07 Tab: Rejected
#07 Tab: Rejected

3. Vacancy Detail — All 6 Tabs

TestResult
Details tab (title, description, location, dates)PASS
Hiring team tabPASS
Vacancy advert tabPASS
Applicants tab (table with name, date, stage, kebab)PASS
Pipeline tab (stages visualization)PASS
Interview tab (AMI configuration)PASS
#09 Details
#09 Details
#10 Hiring team
#10 Hiring team
#11 Advert
#11 Advert
#12 Applicants
#12 Applicants
#13 Pipeline
#13 Pipeline
#14 Interview
#14 Interview

4. Vacancy Creation Wizard (4 Steps)

TestResult
Step 1: Vacancy details formPASS
Step 2: Pipeline & stagesPASS
Step 3: Vacancy advert (AI assistant)PASS
Step 4: Hiring teamPASS
Cancel triggers discard dialogPASS
#15 Step 1
#15 Step 1
#16 Step 2
#16 Step 2
#17 Step 3
#17 Step 3
#18 Step 4
#18 Step 4

5. Applicant List Page

TestResult
Global applicants list loadsPASS
Table columns: name, vacancy, date, stagePASS
Search functionalityPASS
#19 Applicant list
#19 Applicant list
#20 Search
#20 Search

6. Add Applicant with CV Upload + AI Mapping

TestResult
Add applicant form loadsPASS
Vacancy dropdown selectionPASS
CV upload — .txt rejected with "Invalid file type"PASS
Manual form fill (name, email, LinkedIn, location)PASS
LinkedIn URL validation (requires https://)PASS
Application created successfullyPASS
#21 Add applicant
#21 Add applicant
#22 CV upload
#22 CV upload
#23 Form filled
#23 Form filled
#24 Created
#24 Created

7. Applicant Detail — All 4 Tabs

TestResult
Application form tab (all fields editable)PASS
Documents tabPASS
Interview tabPASS
Comments tab (add/view comments)PASS
#25 Form
#25 Form
#26 Documents
#26 Documents
#27 Interview
#27 Interview
#28 Comments
#28 Comments
#29 Comment added
#29 Comment added

8. Applicant Kebab Menu Actions

TestResult
Kebab menu opens on applicant rowPASS
"Move to a vacancy" option presentPASS
"Duplicate" option presentPASS
"Delete" option presentPASS
Move action works correctlyPASS
Duplicate action works correctlyPASS
#41 Applicant kebab
#41 Applicant kebab

9. Vacancy Kebab Menu & Status Changes

TestResult
Kebab menu: Summarise, Edit, Copy, Copy vacancy link, End publishing, DeletePASS
Status dropdown: Open, Closed, On hold, DraftPASS
#39 Vacancy kebab
#39 Vacancy kebab
#40 Status dropdown
#40 Status dropdown

10. Settings — Pipelines and Stages

TestResult
Pipelines list loadsPASS
Pipeline stages visiblePASS
Stage configuration accessiblePASS
#30 Pipelines
#30 Pipelines
#31 Stages
#31 Stages

11. Settings — RSS Feeds

TestResult
External RSS Feed link with Copy buttonPASS
Internal RSS Feed link with Copy buttonPASS
Feed URLs correctly formattedPASS
#32 RSS Feeds
#32 RSS Feeds

12. External Vacancy Board (Public)

TestResult
External board loads at /external/{orgId}PASS
Vacancy listings with status, title, datePASS
Vacancy detail page with Overview tabPASS
"Start application with AMI" button functionalPASS
AMI interview page launches correctlyPASS
#33 External board
#33 External board
#34 Detail
#34 Detail
#35 AMI interview
#35 AMI interview

13. Internal Vacancy Board

TestResult
Internal board loads at /internal/{orgId}PASS
Internal vacancies displayed (2 found)PASS
Vacancy detail with Overview tab and AMI buttonPASS
#36 Internal board
#36 Internal board
#37 Detail
#37 Detail

14. User Account Menu

TestResult
User avatar shows "PW Paul Wagner"PASS
Help link (support.sense.hr)PASS
Log out option presentPASS
#38 User menu
#38 User menu

Deep Regression Results (Sessions 2 & 3)

Additional 14 test areas covering features not exercised in the initial regression: pipeline stage movement, hire/reject flows, document management, vacancy clone/edit/status lifecycle, column filters, bulk actions, AI summarise, stage CRUD, AMI end-to-end interview, and interview question management.

15. Pipeline Stage Movement (All 10 Stages)

TestResult
Move applicant Applied → Short listing via GraphQL APIPASS
Move through Assessment updated, Phone Screening, Interview, Background Checks, OfferPASS
Move to Talent pool (holding stage)PASS
Pipeline view reflects correct stage countsPASS
Pipeline initial state
Pipeline initial state
After API move to Short listing
After API move to Short listing
Talent pool with applicant
Talent pool with applicant

16. Applicant Hire Flow

TestResult
updateApplicationStatus(HIRED) with startDate via GraphQLPASS
Hired applicant appears in Hired pipeline columnPASS
getApplicantData returns vacancy/department/employment data for Sense HR exportPASS
Denis TDR in Hired stage
Denis TDR in Hired stage
Hire dialog with date picker
Hire dialog with date picker

17. Applicant Reject Flow

TestResult
updateApplicationStatus(REJECTED) with reason + notes via GraphQLPASS
Rejected applicant appears in Rejected pipeline columnPASS
Reject reason persisted correctlyPASS
Rejected with reason
Rejected with reason
Rejected column visible
Rejected column visible

18. Document Management

TestResult
Documents tab shows existing documentsPASS
Upload PDF via Add buttonPASS
Uploaded document appears in table with correct metadataPASS
Existing documents
Existing documents
After upload
After upload

19. Vacancy Clone Wizard

TestResult
Clone wizard opens with 4 stepsPASS
Step 1: Details pre-filled from source vacancyPASS
Step 2: Pipeline NOT pre-filled (must re-select)PASS
Step 3: Hiring team pre-filledPASS
Step 4: Advert pre-filledPASS
Clone Step 1
Clone Step 1
Clone Step 2 (empty pipeline)
Clone Step 2 (empty pipeline)
Clone Step 3
Clone Step 3
Clone Step 4
Clone Step 4

20. Vacancy Status Lifecycle

TestResult
Change status Open → Closed via dropdownPASS
Status persists after navigationPASS
Change status Closed → OpenPASS
Status changed to Closed
Status changed to Closed

21. Column Filters

TestResult
Pipeline stage filter popup opens with multi-selectPASS
Filter to "Hired" shows 1 resultPASS
Clear filter restores all resultsPASS
Filter popup
Filter popup
Filtered to Hired only
Filtered to Hired only

22. Bulk Actions

TestResult
Select All checkbox selects all visible applicantsPASS
Bulk toolbar appears with countPASS
All applicants selected
All applicants selected

23. AI Vacancy Summarise

TestResult
Summarise generates AI summary with key requirementsPASS
Summary includes skills, role overview, and structurePASS
AI summary output
AI summary output

24. Vacancy Edit Wizard

TestResult
Edit wizard opens with 3 steps (no pipeline step)PASS
Fields pre-populated from existing vacancyPASS
Edit wizard (3 steps)
Edit wizard (3 steps)

25. End Publishing

TestResult
End publishing confirmation dialog shownPASS
Dialog contains typo: "this vacancies" (minor)PASS
Vacancy removed from external board after unpublishingPASS
End publishing dialog
End publishing dialog

26. Settings — Pipelines & Stages (Deep)

TestResult
Pipelines page loads with 5 pipelinesPASS
Stages page loads with 33 stagesPASS
Create stage dialog: Name, Description, Color pickerPASS
Pipelines
Pipelines
Stages (33)
Stages (33)
Create stage dialog
Create stage dialog

27. AMI Full Application Flow (End-to-End)

TestResult
External board → Start application with AMIPASS
AMI registration: name, email, phone, LinkedIn, locationPASS
AMI asks 3 interview questions + 1 optionalPASS
AMI shows answer summary table for reviewPASS
AMI saves interview answers successfullyPASS
Application appears in ATS admin (8 applicants, was 7)PASS
Interview tab shows 96% match, ranking 4/4PASS
All registration data (name, email, LinkedIn, location) saved correctlyPASS
External board
External board
AMI welcome
AMI welcome
First question
First question
Answer summary
Answer summary
Interview complete
Interview complete
Applicant in ATS admin
Applicant in ATS admin
Interview scores 96%
Interview scores 96%
Registration data saved
Registration data saved

28. Interview Question Management

TestResult
Edit link opens AMI admin interfacePASS
Preview button shows question configuration dialogPASS
Questions show: text, category, required/optional, acceptance criteria, accepted answerPASS
Category weight distribution displayed (11 categories)PASS
AMI admin
AMI admin
Question preview dialog
Question preview dialog

Observations

1. Angular Change Detection Timing (Minor)

After updating an application's email (TC2), the isDuplicated icon may not render until a page reload. The GraphQL API returns the correct value immediately. Not a data bug.

2. End Publishing Dialog Typo (Minor)

The confirmation dialog reads "Are you sure you want to end publishing for this vacancies?" — should be "this vacancy" (singular). Grammar bug only.

3. Clone Wizard Does Not Pre-fill Pipeline (UX)

When cloning a vacancy, Step 2 (Vacancy Setup) does not pre-select the source vacancy's pipeline. The user must re-select it manually. All other steps (Details, Hiring team, Advert) are correctly pre-filled.

4. Edit Wizard Has 3 Steps (Not 4)

The vacancy edit wizard has 3 steps (Vacancy Details, Hiring team, Vacancy Advert) — no pipeline step. Pipeline changes must be made via the Pipeline tab's "Edit pipeline" button. This is by design since changing a pipeline after applicants are assigned could break stage mappings.

5. CDK Drag-and-Drop Unreliable for Automation

Angular CDK drag-and-drop uses dynamically numbered cdk-drop-list-N IDs that change on navigation. Raw mouse events for drag simulation are unreliable. Direct GraphQL updateApplicationStage mutation is the reliable automation path for pipeline stage movement testing.

6. AMI Interview Scoring

AMI automatically scores interview answers across 11 categories with weighted percentages. The test applicant received 96% match score, 96% answer quality, with 100% on Experience, Team Skills, and Problem Solving categories. Scoring is AI-generated and non-deterministic.

7. Name Collisions Are Not Flagged

Same-name pairs with different emails correctly show no duplicate indicator. The isDuplicated check is email-based per ATS-653 spec.

Conclusion

The Sense ATS staging build passes full + deep regression testing. All 30 test areas across 3 sessions function correctly. This includes:

  • Core features: vacancies, applicants, pipelines, settings, boards, user account (14 areas)
  • ATS-653: isDuplicated flag works correctly across all 4 test scenarios (4 areas)
  • Deep regression: pipeline stage movement (all 10 stages), hire/reject flows, document management, vacancy clone/edit/status lifecycle, column filters, bulk actions, AI summarise, stage CRUD, AMI end-to-end interview, interview question management (14 areas)

Minor observations (non-blocking):

  • Angular change-detection timing for isDuplicated icon (TC2) — UI only, data correct
  • End publishing dialog grammar typo ("this vacancies")
  • Clone wizard does not pre-fill pipeline in Step 2

Recommendation: ATS-653 is ready to move from "Ready for QA" to "Done". No blocking issues found in any feature area.