
Add batch pacman checks and package resolution specs

master
AI Bot 4 days ago committed by Riyyi
parent
commit
2a90c0cd11
  1. openspec/changes/batch-pacman-checks/.openspec.yaml (+2)
  2. openspec/changes/batch-pacman-checks/design.md (+137)
  3. openspec/changes/batch-pacman-checks/proposal.md (+35)
  4. openspec/changes/batch-pacman-checks/specs/batch-package-resolution/spec.md (+72)
  5. openspec/changes/batch-pacman-checks/specs/dry-run-simulation/spec.md (+28)
  6. openspec/changes/batch-pacman-checks/tasks.md (+51)

2
openspec/changes/batch-pacman-checks/.openspec.yaml

@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-04-14

137
openspec/changes/batch-pacman-checks/design.md

@@ -0,0 +1,137 @@
## Context
Currently, `pkg/pacman/pacman.go` uses subprocess calls to query pacman for package existence:
- `pacman -Qip <pkg>` to check local DB (per package)
- `pacman -Sip <pkg>` to check sync repos (per package)
For n packages, this spawns 2n subprocesses (up to ~300 for typical package lists). Each subprocess has fork/exec overhead, making this the primary performance bottleneck.
The AUR queries are already batched (single HTTP POST with all package names), which is the desired pattern.
## Goals / Non-Goals
**Goals:**
- Eliminate subprocess overhead for local/sync DB package lookups
- Maintain batched AUR HTTP calls (single request per batch)
- Track installed status per package in PackageInfo
- Provide dry-run output showing exact packages to install/remove
- Handle orphan cleanup correctly
**Non-Goals:**
- Parallel AUR builds (still sequential)
- Custom pacman transaction handling (use system pacman)
- Repository configuration changes
- Package download/compile optimization
## Decisions
### 1. Use Jguer/dyalpm for DB access
**Decision**: Use `github.com/Jguer/dyalpm` library instead of spawning subprocesses.
**Rationale**:
- Direct libalpm access (same backend as pacman)
- Already Go-native with proper type safety
- Supports batch operations via `GetPkgCache()` and `PkgCache()` iterators
**Alternatives considered**:
- Parse `pacman -Qs` output - fragile, still subprocess-based
- Write custom libalpm bindings - unnecessary effort
### 2. Single-pass package resolution algorithm
**Decision**: Process all packages through local DB → sync DBs → AUR in one pass.
```
For each package in collected state:
  1. Check local DB (batch lookup) → if found, mark Installed=true
  2. If not local, check all sync DBs (batch lookup per repo)
  3. If not in sync, append to AUR batch
Batch query AUR with all remaining packages
Return an error if any package is not found in local/sync/AUR
Collect installed status from local DB
(Perform sync operations - skip in dry-run)
(Mark ALL currently installed packages as deps - skip in dry-run)
(Then mark collected state packages as explicit - skip in dry-run)
(Cleanup orphans - skip in dry-run)
Output summary
```
**Rationale**:
- Single iteration over packages
- Batch DB lookups minimize libalpm calls
- Clear error handling for missing packages
- Consistent with existing behavior
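The resolution pass above can be sketched without libalpm by treating the local and sync DBs as prebuilt name sets. This is a minimal illustration, not the implementation in `pkg/pacman`: the `aurLookup` function type, the `resolvePackages` signature, and the exact `PackageInfo` fields are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// PackageInfo mirrors the fields named in this design; the real struct in
// pkg/pacman may carry more fields.
type PackageInfo struct {
	Name      string
	Installed bool
	InAUR     bool
}

// aurLookup is a hypothetical stand-in for the existing batched AUR call:
// it receives every unresolved name at once and returns the names it found.
type aurLookup func(names []string) (map[string]bool, error)

// resolvePackages walks the state once: local DB, then sync DBs, then a
// single AUR batch for whatever is left.
func resolvePackages(state []string, localPkgs, syncPkgs map[string]bool, aur aurLookup) (map[string]PackageInfo, error) {
	resolved := make(map[string]PackageInfo, len(state))
	var aurBatch []string
	for _, name := range state {
		switch {
		case localPkgs[name]:
			resolved[name] = PackageInfo{Name: name, Installed: true}
		case syncPkgs[name]:
			resolved[name] = PackageInfo{Name: name}
		default:
			aurBatch = append(aurBatch, name)
		}
	}
	if len(aurBatch) == 0 {
		return resolved, nil
	}
	found, err := aur(aurBatch) // one call for all remaining names
	if err != nil {
		return nil, err
	}
	var missing []string
	for _, name := range aurBatch {
		if found[name] {
			resolved[name] = PackageInfo{Name: name, InAUR: true}
		} else {
			missing = append(missing, name)
		}
	}
	if len(missing) > 0 {
		sort.Strings(missing)
		return nil, fmt.Errorf("packages not found: %s", strings.Join(missing, ", "))
	}
	return resolved, nil
}

func main() {
	localPkgs := map[string]bool{"bash": true}
	syncPkgs := map[string]bool{"bash": true, "vim": true}
	aur := func(names []string) (map[string]bool, error) {
		return map[string]bool{"yay": true}, nil
	}
	pkgs, err := resolvePackages([]string{"bash", "vim", "yay"}, localPkgs, syncPkgs, aur)
	fmt.Println(pkgs, err)
}
```

Collecting unresolved names into one slice before calling `aur` is what keeps the AUR side at a single request regardless of how many packages fall through.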
### 3. Batch local/sync DB lookup implementation
**Decision**: For local DB, iterate `localDB.PkgCache()` once and build a map. For sync DBs, iterate each repo's `PkgCache()`.
**Implementation**:
```go
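// Build the local package map in one pass.
localPkgs := make(map[string]bool)
localDB.PkgCache().ForEach(func(pkg alpm.Package) error {
	localPkgs[pkg.Name()] = true
	return nil
})

// Same pattern for each sync DB: one pass per repo into a shared map.
syncPkgs := make(map[string]bool)
for _, syncDB := range syncDBs {
	syncDB.PkgCache().ForEach(func(pkg alpm.Package) error {
		syncPkgs[pkg.Name()] = true
		return nil
	})
}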
```
**Rationale**:
- One O(N) iteration per DB, where N = total packages in that DB, instead of one query per state package
- Single map construction, O(1) lookups per state package
- libalpm iterators are already lazy, no additional overhead
### 4. Dry-run behavior
**Decision**: Dry-run outputs exact packages that would be installed/removed without making any system changes.
**Implementation**:
- Skip `pacman -Syu` call
- Skip `pacman -D --asdeps` (mark all installed as deps)
- Skip `pacman -D --asexplicit` (mark state packages as explicit)
- Skip `pacman -Rns` orphan cleanup
- Still compute what WOULD happen for output
**Note on marking strategy**:
Instead of diffing between before/after installed packages, we simply:
1. After sync completes, run `pacman -D --asdeps` on ALL currently installed packages (this marks everything as deps)
2. Then run `pacman -D --asexplicit` on the collected state packages (this overrides them to explicit)
This is simpler and achieves the same result.
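The two-step marking can be sketched as a function that only builds the argument lists for the two `pacman -D` calls. The name `markCommands` and its parameters are hypothetical; actually running the commands (for example via `os/exec`) is left out so the sketch stays side-effect free.

```go
package main

import "fmt"

// markCommands returns the two pacman invocations described above, in order:
// first mark everything installed as a dependency, then re-mark the state
// packages as explicit. In dry-run mode it returns nothing, since marking
// is skipped entirely.
func markCommands(installed, state []string, dryRun bool) [][]string {
	if dryRun {
		return nil
	}
	var cmds [][]string
	if len(installed) > 0 {
		cmds = append(cmds, append([]string{"pacman", "-D", "--asdeps"}, installed...))
	}
	if len(state) > 0 {
		cmds = append(cmds, append([]string{"pacman", "-D", "--asexplicit"}, state...))
	}
	return cmds
}

func main() {
	for _, c := range markCommands([]string{"bash", "glibc", "vim"}, []string{"vim"}, false) {
		fmt.Println(c)
	}
}
```

The order matters: running `--asexplicit` second is what lets the state packages override the blanket `--asdeps` pass.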
## Risks / Trade-offs
1. **[Risk]** dyalpm initialization requires root privileges
   - **[Mitigation]** This is the same as pacman itself; if the user can't run pacman, declpac won't work
2. **[Risk]** libalpm state becomes stale if another pacman instance runs concurrently
   - **[Mitigation]** Use proper locking by relying on pacman's own locking mechanism
3. **[Risk]** AUR packages still built sequentially
   - **[Acceptable]** Parallel AUR builds are out of scope for this change
4. **[Risk]** Memory usage for large package lists
   - **[Mitigation]** Package map is ~100 bytes per package; 10k packages = ~1MB
## Migration Plan
1. Add `github.com/Jguer/dyalpm` to go.mod
2. Refactor `ValidatePackage()` to use dyalpm instead of subprocesses
3. Add `Installed bool` to `PackageInfo` struct
4. Implement new resolution algorithm in `categorizePackages()`
5. Update `Sync()` and `DryRun()` to use new algorithm
6. Test with various package combinations
7. Verify output matches previous behavior
## Open Questions
- **Q**: Should we also use dyalpm for `GetInstalledPackages()`?
- **A**: Yes, can use localDB.PkgCache().Collect() or iterate - aligns with overall approach

35
openspec/changes/batch-pacman-checks/proposal.md

@@ -0,0 +1,35 @@
## Why
The current pacman implementation spawns two subprocesses per package (`pacman -Qip`, `pacman -Sip`) to check whether packages exist in the local or sync DBs. With many packages, this creates significant overhead. The Jguer/dyalpm library provides direct libalpm access for batch queries, which eliminates the subprocess overhead while keeping the already-batched AUR HTTP calls.
## What Changes
- **Add dyalpm dependency**: Integrate Jguer/dyalpm library for direct libalpm access
- **Batch local DB check**: Use `localDB.PkgCache()` to check all packages at once instead of per-package `pacman -Qip`
- **Batch sync DB check**: Use `syncDBs[i].PkgCache()` to check all sync repos at once instead of per-package `pacman -Sip`
- **Enhance PackageInfo**: Add `Installed bool` field to track if package is already installed
- **New algorithm**: Implement unified package resolution flow:
  1. Batch check local DB for all packages
  2. Batch check sync DBs for remaining packages
  3. Batch query AUR for non-found packages
  4. Track installed status throughout
  5. Perform sync operations with proper marking
  6. Output summary of changes
## Capabilities
### New Capabilities
- `batch-package-resolution`: Unified algorithm that batch-resolves packages from local DB → sync DBs → AUR with proper installed tracking
- `dry-run-simulation`: Shows exact packages that would be installed/removed without making changes
### Modified Capabilities
- None - this is a pure optimization with no behavior changes visible to users
## Impact
- **Code**: `pkg/pacman/pacman.go` - refactored to use dyalpm
- **Dependencies**: Add Jguer/dyalpm to go.mod
- **APIs**: `ValidatePackage()` signature changes (returns installed status)
- **Performance**: 2n subprocess calls for n packages → no subprocesses; local/sync DB checks become one pass per DB plus O(1) map lookups

72
openspec/changes/batch-pacman-checks/specs/batch-package-resolution/spec.md

@@ -0,0 +1,72 @@
## ADDED Requirements
### Requirement: Batch package resolution from local, sync, and AUR databases
The system SHALL resolve packages in a single pass through local DB → sync DBs → AUR using batch operations to minimize subprocess/API calls.
#### Scenario: Package exists in local DB
- **WHEN** a package from collected state exists in the local database
- **THEN** the system SHALL mark it as found, set `Installed=true`, and exclude it from AUR queries
#### Scenario: Package exists in sync DB
- **WHEN** a package from collected state does NOT exist in local DB but exists in ANY enabled sync database
- **THEN** the system SHALL mark it as found, set `Installed=false`, and exclude it from AUR queries
#### Scenario: Package exists only in AUR
- **WHEN** a package from collected state does NOT exist in local or sync databases but exists in AUR
- **THEN** the system SHALL mark it as found with `InAUR=true`, set `Installed=false`, and use the cached AUR info
#### Scenario: Package not found anywhere
- **WHEN** a package from collected state is NOT in local DB, NOT in any sync DB, and NOT in AUR
- **THEN** the system SHALL return an error listing the unfound package(s)
#### Scenario: Batch AUR query
- **WHEN** multiple packages need AUR lookup
- **THEN** the system SHALL make a SINGLE HTTP request to AUR RPC with all package names (existing behavior preserved)
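As an illustration of the single-request scenario, the sketch below builds one form body that carries every package name. The `v`, `type`, and `arg[]` parameter names follow the public AUR RPC v5 interface; how the existing `fetchAURInfo` encodes its request is not shown in this change, so `aurInfoForm` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/url"
)

// aurInfoForm encodes a multi-package "info" query as one
// application/x-www-form-urlencoded body: one arg[] entry per package.
func aurInfoForm(names []string) string {
	form := url.Values{}
	form.Set("v", "5")
	form.Set("type", "info")
	for _, n := range names {
		form.Add("arg[]", n)
	}
	return form.Encode()
}

func main() {
	fmt.Println(aurInfoForm([]string{"yay", "paru"}))
}
```

Because every name is appended to the same form, the request count stays at one no matter how many packages reach the AUR step.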
### Requirement: Efficient local DB lookup using dyalpm
The system SHALL use dyalpm's `PkgCache()` iterator to build a lookup map in O(N) time, where N is the total number of packages in the local DB, instead of spawning one subprocess per state package.
#### Scenario: Build local package map
- **WHEN** initializing package resolution
- **THEN** the system SHALL iterate localDB.PkgCache() once and store all package names in a map for O(1) lookups
#### Scenario: Check package in local map
- **WHEN** checking if a package exists in local DB
- **THEN** the system SHALL perform an O(1) map lookup instead of spawning a subprocess
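A self-contained sketch of the two scenarios above: walk a package cache once, then answer membership checks from the resulting set. The `pkg` and `pkgCache` interfaces are hypothetical stand-ins for the dyalpm types (only `Name()` and `ForEach()` are assumed), and `fakeCache` exists purely so the example runs without libalpm.

```go
package main

import "fmt"

// pkg and pkgCache are hypothetical stand-ins for the dyalpm package and
// package-cache types.
type pkg interface{ Name() string }

type pkgCache interface {
	ForEach(func(pkg) error) error
}

// buildPkgSet walks a cache once and records every package name.
func buildPkgSet(cache pkgCache) (map[string]struct{}, error) {
	set := make(map[string]struct{})
	err := cache.ForEach(func(p pkg) error {
		set[p.Name()] = struct{}{}
		return nil
	})
	return set, err
}

// fakePkg and fakeCache let the sketch run without libalpm.
type fakePkg string

func (f fakePkg) Name() string { return string(f) }

type fakeCache []fakePkg

func (c fakeCache) ForEach(fn func(pkg) error) error {
	for _, p := range c {
		if err := fn(p); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	set, _ := buildPkgSet(fakeCache{"bash", "vim"})
	_, ok := set["vim"] // O(1) membership check, no subprocess
	fmt.Println(ok)
}
```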
### Requirement: Efficient sync DB lookup using dyalpm
The system SHALL use each sync DB's `PkgCache()` iterator to check packages across all enabled repositories.
#### Scenario: Check package in sync DBs
- **WHEN** a package is not found in local DB
- **THEN** the system SHALL check all enabled sync databases using their iterators
#### Scenario: Package found in multiple sync repos
- **WHEN** a package exists in more than one sync repository (e.g., core and extra)
- **THEN** the system SHALL use the first match found
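The first-match rule can be sketched as an ordered scan over repositories. The `repo` type and `findInSync` function are hypothetical; the sketch assumes the slice order reflects the order in which the sync DBs are registered.

```go
package main

import "fmt"

// repo is a hypothetical pairing of a repository name with its package set.
type repo struct {
	Name string
	Pkgs map[string]bool
}

// findInSync returns the first repository, in slice order, that lists name.
func findInSync(repos []repo, name string) (string, bool) {
	for _, r := range repos {
		if r.Pkgs[name] {
			return r.Name, true
		}
	}
	return "", false
}

func main() {
	repos := []repo{
		{Name: "core", Pkgs: map[string]bool{"bash": true}},
		{Name: "extra", Pkgs: map[string]bool{"bash": true, "vim": true}},
	}
	fmt.Println(findInSync(repos, "bash"))
	fmt.Println(findInSync(repos, "vim"))
}
```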
### Requirement: Track installed status in PackageInfo
The system SHALL include an `Installed bool` field in `PackageInfo` to indicate whether the package is currently installed.
#### Scenario: Package is installed
- **WHEN** a package exists in the local database
- **THEN** `PackageInfo.Installed` SHALL be `true`
#### Scenario: Package is not installed
- **WHEN** a package exists only in sync DB or AUR (not in local DB)
- **THEN** `PackageInfo.Installed` SHALL be `false`
### Requirement: Mark installed packages as deps, then state packages as explicit
After package sync completes, the system SHALL mark all installed packages as dependencies, then override the collected state packages to be explicit. This avoids diffing before/after states.
#### Scenario: Mark all installed as deps
- **WHEN** package sync has completed (non-dry-run)
- **THEN** the system SHALL run `pacman -D --asdeps` to mark ALL currently installed packages as dependencies
#### Scenario: Override state packages to explicit
- **WHEN** all installed packages have been marked as deps
- **THEN** the system SHALL run `pacman -D --asexplicit` on the collected state packages, overriding their dependency status
#### Scenario: Dry-run skips marking
- **WHEN** operating in dry-run mode
- **THEN** the system SHALL NOT execute any `pacman -D` marking operations

28
openspec/changes/batch-pacman-checks/specs/dry-run-simulation/spec.md

@@ -0,0 +1,28 @@
## ADDED Requirements
### Requirement: Dry-run shows packages to install without making changes
In dry-run mode, the system SHALL compute what WOULD happen without executing any pacman operations.
#### Scenario: Dry-run lists packages to install
- **WHEN** dry-run is enabled and packages need to be installed
- **THEN** the system SHALL populate `Result.ToInstall` with all packages that would be installed (both sync and AUR)
#### Scenario: Dry-run does not list packages to remove
- **WHEN** dry-run is enabled and orphan packages exist
- **THEN** the system SHALL NOT calculate or populate `Result.ToRemove` - orphan detection is skipped entirely in dry-run mode
#### Scenario: Dry-run skips pacman sync
- **WHEN** dry-run is enabled
- **THEN** the system SHALL NOT execute `pacman -Syu` for package installation
#### Scenario: Dry-run skips explicit/deps marking
- **WHEN** dry-run is enabled
- **THEN** the system SHALL NOT execute `pacman -D --asdeps` or `pacman -D --asexplicit`
#### Scenario: Dry-run skips orphan cleanup
- **WHEN** dry-run is enabled
- **THEN** the system SHALL NOT execute `pacman -Rns` for orphan removal
#### Scenario: Dry-run outputs count summary
- **WHEN** dry-run is enabled
- **THEN** the system SHALL still compute and output `Result.Installed` and `Result.Removed` counts as if the operations had run
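A minimal sketch of the install side of dry-run: derive `ToInstall` from the resolved state without executing anything. The `Result` fields shown here are a reduced, assumed shape (only `ToInstall` and an `Installed` count), and `planDryRun` is a hypothetical name.

```go
package main

import "fmt"

// PackageInfo carries the resolution outcome for one state package.
type PackageInfo struct {
	Name      string
	Installed bool
	InAUR     bool
}

// Result holds only the fields this sketch needs; the real struct may differ.
type Result struct {
	ToInstall []string
	Installed int
}

// planDryRun lists every state package that is not installed yet, whether it
// comes from a sync repo or the AUR. It runs no pacman command.
func planDryRun(state []PackageInfo) Result {
	var r Result
	for _, p := range state {
		if !p.Installed {
			r.ToInstall = append(r.ToInstall, p.Name)
		}
	}
	r.Installed = len(r.ToInstall)
	return r
}

func main() {
	state := []PackageInfo{
		{Name: "bash", Installed: true},
		{Name: "vim"},
		{Name: "yay", InAUR: true},
	}
	fmt.Println(planDryRun(state))
}
```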

51
openspec/changes/batch-pacman-checks/tasks.md

@@ -0,0 +1,51 @@
## 1. Setup
- [ ] 1.1 Add `github.com/Jguer/dyalpm` to go.mod
- [ ] 1.2 Run `go mod tidy` to fetch dependencies
## 2. Core Refactoring
- [ ] 2.1 Update `PackageInfo` struct to add `Installed bool` field
- [ ] 2.2 Create `Pac` struct with `alpm.Handle` instead of just aurCache
- [ ] 2.3 Implement `NewPac()` that initializes alpm handle and local/sync DBs
## 3. Package Resolution Algorithm
- [ ] 3.1 Implement `buildLocalPkgMap()` - iterate localDB.PkgCache() to create lookup map
- [ ] 3.2 Implement `checkSyncDBs()` - iterate each sync DB's PkgCache() to find packages
- [ ] 3.3 Implement `resolvePackages()` - unified algorithm:
  - Step 1: Check local DB for all packages (batch)
  - Step 2: Check sync DBs for remaining packages (batch per repo)
  - Step 3: Batch query AUR for remaining packages
  - Step 4: Return error if any package unfound
  - Step 5: Track installed status from local DB
## 4. Sync and DryRun Integration
- [ ] 4.1 Refactor `Sync()` function to use new resolution algorithm
- [ ] 4.2 Refactor `DryRun()` function to use new resolution algorithm
- [ ] 4.3 Preserve AUR batched HTTP calls (existing `fetchAURInfo`)
- [ ] 4.4 Preserve orphan cleanup logic (`CleanupOrphans()`)
## 5. Marking Operations
- [ ] 5.1 Keep `MarkExplicit()` for marking state packages
- [ ] 5.2 After sync, run `pacman -D --asdeps` on ALL installed packages (simplifies tracking)
- [ ] 5.3 After deps marking, run `pacman -D --asexplicit` on collected state packages (overrides deps)
- [ ] 5.4 Skip marking operations in dry-run mode
## 6. Cleanup and Output
- [ ] 6.1 Remove subprocess-based `ValidatePackage()` implementation
- [ ] 6.2 Remove subprocess-based `GetInstalledPackages()` implementation
- [ ] 6.3 Update output summary to show installed/removed counts
- [ ] 6.4 In dry-run mode, populate `ToInstall` and `ToRemove` lists
## 7. Testing
- [ ] 7.1 Test with packages in local DB only
- [ ] 7.2 Test with packages in sync DBs only
- [ ] 7.3 Test with AUR packages
- [ ] 7.4 Test with missing packages (should error)
- [ ] 7.5 Test dry-run mode output
- [ ] 7.6 Test orphan detection and cleanup