The Future of Canon AF Systems beyond the EOS R1 and EOS R5 Mark II — Deep Technical Analysis
"Canon’s EOS R1 and EOS R5 Mark II represent two peaks of the company’s recent mirrorless AF engineering: the R1 as a thermally engineered, pro-level implementation of advanced Dual Pixel AF with expanded cross-type detection and sport/bird optimizations; the R5 Mark II as a more general-purpose high-resolution, high-compute body. Moving beyond these platforms requires integrated advances across sensor architecture, on-device computation, lens actuation & telemetry, and probabilistic/perceptual AF pipelines.
The next generation of Canon AF will be shaped by four central thrusts:
- Sensor-level innovation — denser, multi-directional phase detection, stacked/BSI readout architectures, and optionally spectrally or polarization-sensitive AF pixels to disambiguate hard cases. (Canon Global)
- On-device neural compute — dedicated neural accelerators (either integrated into future DIGIC platforms or as discrete co-processors) to run heavier detection and pose networks at low latency. Industry trends (e.g., intelligent vision sensors with on-chip inference) show the technical feasibility and practical benefits. (Sony Semiconductor)
- Lens–body cooperative control — richer RF mount telemetry and closed-loop actuation using lens-embedded sensors and adaptive motor control to remove physical execution uncertainty. The RF protocol already increases bandwidth versus EF; next steps will standardize richer telemetry. (Canon Europe)
- Probabilistic, multi-stage AF algorithms — hybrid detection + tracking pipelines that fuse visual detections, IMU data, lens telemetry, and explicit motion priors (e.g., bird flight dynamics) with Kalman / particle filtering and learned motion models for robust occlusion handling and prediction.
This paper explains the engineering rationale, describes concrete architectures and algorithms, highlights implementation constraints (thermal, power, backward-compatibility), and provides a roadmap for near- to mid-term product cycles and research directions. Where possible I anchor claims in product or academic references. (Canon U.S.A.)
1. Background: Canon’s Dual Pixel tradition and the R1 / R5 Mark II baseline
1.1. The Dual Pixel CMOS AF foundation
Canon’s Dual Pixel CMOS AF (DPAF) is a phase-detection approach implemented at the imaging-pixel level: each imaging pixel is split into two photodiodes that provide phase information without requiring separate PDAF pixels or a mirror mechanism. This allows dense phase detection across much of the imaging sensor while still capturing image irradiance on the same pixel array (i.e., it is not a separate AF sensor). DPAF’s strengths include smooth, low-hunting AF transitions, dense field coverage for semantic detection, and excellent video AF performance because the AF sensor and the imaging sensor are one and the same. These properties are the foundation for Canon’s modern AF performance. (Canon U.S.A.)
However, DPAF historically had directionality limits (many early implementations measured primarily vertical line displacement), and under certain textures — e.g., subjects with few vertical features, or scenes with repetitive vertical patterns — it could misacquire the wrong surface. Canon’s R1 addressed this by supporting rotated pupil division (effectively cross-type/bi-directional PD detection), enabling horizontal as well as vertical PD sensing in the same sensor. This cross-type capability materially reduces certain failure modes (e.g., birds with extended wings, mesh occlusions). (Canon U.S.A.)
1.2. What the R1 and R5 Mark II leave unsolved
The R1 shows how far DPAF can scale in a thermally-engineered flagship, and the R5 Mark II provides a complementary approach balancing resolution and speed. But practical failure modes remain:
- Occlusion and distractor problems: when the intended subject is partially occluded by foreground objects or when multiple similar objects are present, simple per-frame PD measurements can latch to a distractor.
- Rapid, non-linear motion: sudden accelerations (e.g., birds changing direction) create prediction burdens that pure reactive AF struggles to meet because of body+lens actuation latency.
- Low-contrast or textureless scenes: phase information may be weak for low-contrast textures or transparent surfaces.
Addressing these requires combining better sensing (more robust PD measurements, additional modalities), richer compute (learned detection/identity and predictive models), and more precise actuation. The rest of this paper explores the technical steps necessary for that integration. (Canon Georgia)
2. Sensor architecture: beyond denser PD — multi-modal on-sensor AF
Sensor evolution is the most foundational hardware lever. Improvements in pixels and readout can reduce latency and increase robustness cheaply compared to full optical or mechanical redesign.
2.1. Multi-directional PD and cross-type pixels
The R1’s approach of rotating the pupil division to detect horizontal PD in addition to vertical PD demonstrates a path: pixel designs that support multiple phase-split orientations (vertical, horizontal, diagonal), either via programmable micro-optics or by interleaving multiple pixel types across the array. Interleaving supports per-region orientation diversity and reduces the chance of uniform failure modes across the frame.
Design trade-offs:
- Fill factor vs. PD capability: more complex pixel microstructure can reduce fill factor and SNR. Engineering must balance photodiode area, micro-lens geometry, and readout noise.
- Calibration complexity: multiple PD orientations require per-pixel calibration of phase offsets and angular sensitivity; this increases factory calibration steps and possibly on-field auto-calibration routines.
Academic work on multi-phase pixels and multi-scale PD (Jang et al., 2015) shows robust AF using pixels with different phases, supporting the feasibility of such designs. (PMC)
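To make the benefit of orientation diversity concrete, the short Python sketch below (hypothetical function and orientation names, not Canon firmware) fuses disparity estimates from several pupil-split orientations, weighting each by its correlation confidence; a subject that defeats one orientation still contributes a usable defocus estimate through the others.

def fuse_pd_disparities(disparities, confidences):
    """Fuse phase-detection disparity estimates from multiple pupil-split
    orientations (e.g. 'vert', 'horiz', 'diag') into one defocus estimate,
    weighting each orientation by its correlation confidence in [0, 1]."""
    num = den = 0.0
    for orientation, disparity in disparities.items():
        weight = confidences.get(orientation, 0.0)
        num += weight * disparity
        den += weight
    if den < 1e-6:
        return None, 0.0        # no orientation produced a usable phase signal
    return num / den, den / len(disparities)

# Example: texture that yields a confident estimate on only one split
# orientation still drives the fused result along the reliable axis.
fused, conf = fuse_pd_disparities({"vert": 0.1, "horiz": 2.3},
                                  {"vert": 0.15, "horiz": 0.9})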
2.2. Stacked sensors, on-die memory, and readout latency
Stacked CMOS sensors (BSI + stacked logic and memory) dramatically reduce the latency between pixel exposure and access by placing memory and logic adjacent to the pixel array. This reduces the time between image formation and AF decision, which is crucial for high-speed tracking where even a few milliseconds matter.
Benefits include:
- Lower effective AF latency: faster DMA of sensor telemetry to ISP/AI unit.
- Higher frame rates with continuous AF telemetry: sensors can provide partial readouts dedicated to AF (telemetry windows) while simultaneously outputting image frames. Recent industry moves to “intelligent” stacked sensors with local processing make it feasible to perform some AF pre-processing on-chip. Sony’s IMX500 family demonstrates on-chip AI paradigms in practice. (Sony Semiconductor)
2.3. Additional sensing modalities: spectral, polarization, and depth assist
Hard cases where visual texture is ambiguous (e.g., birds behind foliage or against sky) can benefit from additional sensing modalities:
- Spectral discrimination: small sets of pixels with spectral filters (narrowband) could improve separation between subject and background (feathers vs. leaves) where RGB contrast is low.
- Polarization-sensitive pixels: polarization helps separate reflections (glints) from diffuse surfaces.
- Short-range depth assist (time-of-flight or structured light): a small TOF array or IR depth assist module can help disambiguate subject plane vs. foreground occluder, particularly at short ranges.
These additions add hardware complexity and power cost, but embedding small, low-power depth or polarization modules dedicated to AF telemetry — not image formation — could be a practical compromise. Research into in-sensor focus evaluation (e.g., contrast measures computed on-chip) also shows possible microsecond-scale AF evaluation loops that reduce dependency on external compute. (ScienceDirect)
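As a minimal illustration of an on-chip focus-confidence signal, the sketch below computes a standard gradient-based (Tenengrad-style) sharpness measure over a small luminance window; it assumes the sensor or ISP can expose such a window per AF region and is not the specific in-sensor metric used in the cited research.

import numpy as np

def tenengrad_focus_measure(window):
    """Gradient-based sharpness over a small AF telemetry window (2-D array
    of luminance values). Higher values indicate stronger local texture; a
    sweep across focus positions yields a contrast-style confidence signal
    that can corroborate weak phase-detection readings."""
    gy, gx = np.gradient(window.astype(np.float64))
    return float(np.mean(gx * gx + gy * gy))

# Usage: score the same AF window at two candidate focus positions and
# prefer the position with the higher measure when PD confidence is low.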
3. On-device computation: neural accelerators, multi-stage pipelines, and model design
Sensor telemetry is necessary but insufficient. Modern AF improvements come from perception: identifying the intended subject, tracking identity through occlusions, and predicting motion. These tasks are computationally heavy; thus the next step is on-device neural compute.
3.1. Neural accelerators — existing examples and the case for camera integration
Edge vision sensors combining image capture and inference (Sony’s IMX500/IMX501 line and related industry efforts) show that on-image-sensor inference is practical and power-efficient for many tasks. Cameras benefit from dedicated accelerators for several reasons:
- Lower latency: inference close to the data source reduces bus delays.
- Power efficiency: purpose-built MAC arrays or NPU blocks can run detection/pose networks with far less energy than a general-purpose CPU.
- Privacy & autonomy: on-device learning and inference avoid cloud round trips.
For Canon, integrating a dedicated NPU into future DIGIC SoCs, or adding a discrete co-processor on the mainboard, makes sense. This is already a trend in mobile devices and in some professional camera ecosystems via accessory modules or integrated silicon. Industry demos (e.g., Raspberry Pi + IMX500 AI camera) show practical developer pathways. (Sony Semiconductor)
3.2. Two-stage detection and tracker architecture — rationale and structure
A practical AF pipeline is a two-stage system:
- Global detector (lightweight, high frequency) — runs every frame on a low-compute network to produce coarse detections and candidate bounding boxes for subjects of interest (people, animals, vehicles, ball, etc.). This module runs at full incoming frame rate (e.g., 60–120 Hz on modern bodies) with small networks optimized for low latency.
- Per-candidate tracker + verifier (heavier network, lower frequency) — for each candidate, a heavier network computes identity embeddings, pose/keypoints, and confidence; a probabilistic tracker (Kalman / particle filter) fuses these observations with motion models and lens/IMU telemetry to predict short-term future positions.
This design balances throughput and accuracy: the detector produces candidates cheaply, the tracker invests compute where it matters (active subjects). The per-candidate stage performs model-based prediction and identity retention across occlusions.
Algorithmic details:
- Detector: a tiny one-stage detector (e.g., a Micro-SSD or MobileNet-based YOLO-lite) pruned and quantized to run at ~100+ Hz on an NPU. Outputs: class, bbox, coarse orientation.
- Tracker: a hybrid filter that fuses visual centroid observations, bounding box size (proxy for depth), IMU accelerations, and lens focus-position deltas. It uses a Kalman filter with adaptive process noise tuned per subject class; when multi-modal uncertainty exists, a particle filter or mixture of Kalman filters can maintain multiple hypotheses.
- Re-identification/verification: a compact embedding extracted by an embedded network (e.g., a 128-D feature vector) that allows matching candidate detections to active tracks even after short occlusions.
This pipeline tolerates dropped frames or detector misses because the tracker can predict from motion priors and IMU/actuator telemetry until the detector re-confirms. The system can also escalate compute (e.g., run a heavier pose network) when confidence drops or when the photographer explicitly requests higher fidelity via a dedicated control. This architecture mirrors industry best practice in robotics and autonomous vehicles and is a practical path for camera AF; a compute-escalation sketch follows, and Section 9 gives fuller pseudocode and compute budgeting. (Sony Semiconductor)
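The escalation logic can be as simple as a per-frame budget allocator. The sketch below (hypothetical track objects, costs, and thresholds) spends a fixed verifier budget on the lowest-confidence active tracks first, which is one plausible way to invest compute where it matters.

def schedule_verifier_runs(tracks, budget_ms, conf_thresh=0.6, cost_ms=4.0):
    """Pick which active tracks receive a heavy verifier pass this frame:
    lowest-confidence tracks first, until the per-frame budget is spent or
    all remaining tracks are already above the confidence threshold."""
    chosen = []
    for track in sorted(tracks, key=lambda t: t.confidence):
        if track.confidence >= conf_thresh:
            break               # remaining tracks are healthy enough
        if budget_ms < cost_ms:
            break               # compute budget exhausted for this frame
        chosen.append(track)
        budget_ms -= cost_ms
    return chosen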
3.3. Motion priors and learned dynamics
Motion prediction improves with priors. Instead of a generic constant-velocity model, learned priors conditioned on subject class can significantly reduce prediction error:
- Birds: use a dynamic model incorporating flapping periodicity and maneuvering profiles; learned state transitions can anticipate short bursts of acceleration.
- Cars / cyclists: smoother motion with lane/track constraints; models can incorporate road curvature priors.
- Athletes: high lateral agility, frequent stops/starts — models trained on sports footage can learn characteristic acceleration distributions.
Priors can be represented as learned transition matrices (for linearizable filters), neural nets predicting short-term trajectory deltas, or as class-conditioned covariance schedules for process noise in a Kalman filter. Training datasets drawn from annotated high-frame-rate sports and wildlife video will be needed; Canon’s customer base and pro partnerships can assist in curating such datasets. (Ethical/privacy rules apply if using customer footage; opt-in aggregation is recommended.) (Canon Georgia)
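One lightweight way to encode such priors is a class-conditioned process-noise schedule for the Kalman filter described in Section 5. The sketch below uses the standard discrete white-noise-acceleration model for a constant-velocity state; the per-class scale factors are illustrative assumptions, not measured dynamics.

import numpy as np

# Illustrative class-conditioned acceleration-noise scales (image-plane units):
# birds get far more lateral process noise than cars, reflecting their agility.
PROCESS_NOISE_ACCEL = {"bird": 900.0, "athlete": 400.0, "car": 80.0, "default": 200.0}

def class_conditioned_Q(dt, subject_class):
    """Process-noise covariance Q for state [x, y, vx, vy] under a discrete
    white-noise-acceleration model, scaled by a per-class prior."""
    q = PROCESS_NOISE_ACCEL.get(subject_class, PROCESS_NOISE_ACCEL["default"])
    block = q * np.array([[dt**4 / 4, dt**3 / 2],
                          [dt**3 / 2, dt**2]])
    Q = np.zeros((4, 4))
    Q[np.ix_([0, 2], [0, 2])] = block   # couples x with vx
    Q[np.ix_([1, 3], [1, 3])] = block   # couples y with vy
    return Q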
3.4. On-device learning and personalization
Allowing photographers to “teach” the camera specific subjects helps in repeatable scenarios (a racing team’s car, a particular show bird). Two practical approaches:
- On-device fine-tuning: provide a small buffer and lightweight adaptation routine that updates the last layer of a verification network using a few annotated frames (few-shot learning) — executed only on the NPU to avoid long CPU cycles.
- Profile sharing: photographers can export/import subject profiles between bodies (encrypted, privacy-respecting), enabling teams to preconfigure cameras for a specific event.
Make these features opt-in and ensure clear UI for when the camera is learning to avoid surprises.
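A particularly simple form of on-device personalization, sketched below, sidesteps gradient-based fine-tuning entirely: average a few verifier embeddings of the chosen subject into a prototype and match by cosine similarity. This is a stand-in for the last-layer adaptation described above; the embedding size and the similarity threshold are assumptions.

import numpy as np

def build_subject_prototype(embeddings):
    """Average a handful of verifier embeddings (e.g. 128-D vectors) of the
    user-taught subject into a single unit-norm prototype."""
    proto = np.mean(np.stack(embeddings), axis=0)
    return proto / (np.linalg.norm(proto) + 1e-9)

def matches_subject(embedding, prototype, threshold=0.7):
    """Cosine similarity against the stored prototype decides whether a new
    detection is the personalized subject."""
    sim = float(np.dot(embedding, prototype) / (np.linalg.norm(embedding) + 1e-9))
    return sim >= threshold, sim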
4. Lens actuation and RF telemetry: closing the loop
Good perception must be matched by precise actuation. Even the best prediction fails if the lens cannot rapidly and accurately execute focus commands.
4.1. Richer lens telemetry: what to send and why
The RF mount already increased pin count and bandwidth compared to EF. The next generation should formalize a lens telemetry specification that includes:
- High-resolution focus position encoding (absolute) with timestamped samples.
- Motor torque / motor current sensing as a proxy for friction or stalls.
- Lens temperature and compliance (affects motor performance).
- Inertial micro-sensors embedded in large telephoto lenses (some super-telephoto lenses already include rudimentary sensors for IS; extending to micro-IMUs provides per-lens motion estimates).
- Focus group position sensors with micro-resolution (magnetic encoders or optical encoders) for closed-loop focus control.
High-fidelity, timestamped telemetry lets the body fuse actuation state into the tracker: the tracker can anticipate actuator latency, compensate for overshoot, and schedule commands that maximize the probability the lens is at the predicted focus plane when the shutter opens. Canon’s RF design provides a path to richer communications; standardizing messages and timestamps is the engineering step. (Canon Europe)
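To make "formalized telemetry" concrete, the sketch below models one timestamped telemetry record as a Python dataclass. Field names, units, and the versioning scheme are illustrative assumptions, not a published Canon RF specification.

from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class LensTelemetrySample:
    """One timestamped record of the kind a formalized RF telemetry channel
    could carry from lens to body at high rate."""
    timestamp_us: int                  # body-synchronized clock, microseconds
    focus_position: int                # absolute focus-group encoder counts
    focus_velocity: float              # encoder counts per second, signed
    motor_current_ma: float            # proxy for friction or stalls
    lens_temp_c: float                 # affects motor compliance and damping
    imu_accel_g: Optional[Tuple[float, float, float]] = None   # optional per-lens micro-IMU
    protocol_version: int = 1          # versioned so older lenses degrade gracefully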
4.2. Closed-loop cooperative control
Instead of a naïve command→execute model, future bodies and lenses should run a cooperative control loop:
- Body’s tracker outputs a predicted subject plane and required optical path length (i.e., target focus position).
- Body sends a trajectory for the lens (time-stamped positions with soft deadlines and tolerance bands) rather than a single point command.
- Lens controller executes using local feedforward + PID + friction compensation and returns state.
- If the lens detects that the commanded trajectory will cause unacceptable overshoot (due to temperature or mechanical issue), it can request a negotiated change from the body or flag a suboptimal condition to the UX.
This cooperative approach shrinks the combined prediction-and-execution uncertainty and lets bodies avoid the repeated micro-corrections that increase hunting and wear. High-end lenses with better encoders and motors will realize more of this benefit. (The-Digital-Picture.com)
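A minimal sketch of the negotiation step follows: the body sends a timestamped trajectory with tolerance bands, and the lens performs a feasibility check against its current slew-rate limit before accepting. Message names, units, and the single velocity limit are assumptions for illustration.

from dataclasses import dataclass
from typing import List

@dataclass
class FocusWaypoint:
    deadline_us: int        # when the focus group should reach this position
    position: int           # target encoder counts
    tolerance: int          # acceptable band around the target

@dataclass
class FocusTrajectory:
    waypoints: List[FocusWaypoint]

def lens_accepts(trajectory, max_counts_per_s, now_us, current_pos):
    """Lens-side feasibility check: reject (so the body can renegotiate) any
    trajectory whose implied slew rate exceeds what the actuator can deliver
    without overshoot under current temperature and compliance."""
    prev_t, prev_p = now_us, current_pos
    for wp in trajectory.waypoints:
        dt_s = (wp.deadline_us - prev_t) / 1e6
        if dt_s <= 0 or abs(wp.position - prev_p) / dt_s > max_counts_per_s:
            return False
        prev_t, prev_p = wp.deadline_us, wp.position
    return True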
4.3. Adaptive motor control and new actuator modalities
Actuator advances will be important:
- Improved USM/STM designs with faster step response, less overshoot, and built-in encoders.
- Voice coil motors with active damping to reduce ringing after rapid slews.
- Magneto-rheological damping or variable compliance elements in professional lenses for dynamic tuning — while complex and expensive, pro glass could adopt such technologies for maximum AF responsiveness.
Design trade-offs include cost, weight, power, and long-term reliability. For pro lenses, cost/weight trade-offs favor performance; consumer glass emphasizes cost and battery life.
5. Probabilistic AF control: filters, hypotheses, and recovery strategies
A camera’s AF controller must reason under uncertainty. Below I detail practical, implementable probabilistic algorithms and recovery modes.
5.1. Hybrid Kalman / particle filtering for short-term prediction
A Kalman filter (KF) provides an optimal linear estimator under Gaussian noise assumptions. Practical AF requires:
- State vector: position (image coordinates), velocity, scale (bounding box size as inverse depth proxy), and optionally acceleration.
- Observation model: detector outputs (bbox centroid + size), lens focus position mapped to subject depth (through lens calibration), IMU accelerations, and depth assist readings.
- Process noise: class-conditioned and adaptive — birds have higher process noise in lateral directions.
When multi-modal uncertainty arises (e.g., multiple candidate detections similar to target), a particle filter (PF) or mixture of KFs maintains multiple hypotheses with associated weights. PFs are computationally heavier but can be constrained to the short horizon (e.g., 100–300 ms) to remain tractable.
Implementation tips:
- Use an adaptive gating mechanism so that detector observations far from predicted position (beyond a class-conditioned Mahalanobis distance) are withheld to prevent identity swaps.
- When the tracker’s confidence drops below a threshold (e.g., after occlusion or long miss), escalate to a re-detection routine that performs a wider search and, if possible, solicits user input (e.g., half-press focus).
- Maintain a confidence score that combines detection probability, embedding similarity, and tracker uncertainty. Display this to users as an overlay and use it to schedule compute (run heavier verifier when confidence low).
The KF predict/update cycle is sketched below; Section 9 shows how it slots into the full detection-and-tracking pipeline and its compute budget.
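The following NumPy sketch implements the predict/update cycle for the state vector described above (centroid, bounding-box scale, and their rates), and returns a squared Mahalanobis distance usable for the gating test mentioned earlier. Noise magnitudes are placeholders and would be class-conditioned in practice.

import numpy as np

class TrackKF:
    """Minimal constant-velocity Kalman filter over state
    [x, y, s, vx, vy, vs] (image centroid, bbox scale, and their rates).
    The observation is [x, y, s] from the detector."""
    def __init__(self, x0, y0, s0, q_accel=200.0, r_meas=4.0):
        self.x = np.array([x0, y0, s0, 0.0, 0.0, 0.0], dtype=float)
        self.P = np.diag([10.0, 10.0, 10.0, 100.0, 100.0, 100.0])
        self.q_accel = q_accel                 # class-conditioned in practice
        self.R = np.eye(3) * r_meas            # measurement noise

    def predict(self, dt):
        F = np.eye(6)
        F[0, 3] = F[1, 4] = F[2, 5] = dt       # position integrates velocity
        Q = np.eye(6) * self.q_accel * dt      # crude isotropic process noise
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + Q

    def update(self, z):
        H = np.zeros((3, 6))
        H[0, 0] = H[1, 1] = H[2, 2] = 1.0      # observe x, y, s directly
        y = np.asarray(z, dtype=float) - H @ self.x          # innovation
        S = H @ self.P @ H.T + self.R
        K = self.P @ H.T @ np.linalg.inv(S)                  # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ H) @ self.P
        return float(y @ np.linalg.solve(S, y))              # Mahalanobis^2 for gating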
5.2. Recovery strategies and UX design
No matter how good the models, recovery is crucial:
- Graceful fallbacks: if primary tracker fails, fallback to a less constrained multi-class detector with larger area search, but lower priority to avoid jumping to new distractors.
- Photographer-assisted re-acquisition: small, intuitive controls (rear dial press or touch to “anchor” a subject) should allow instant reassigning of tracking identity when automatic systems fail.
- Explainable feedback: indicate why the camera switched targets (e.g., “higher confidence: face detected” or “occlusion timeout”) to help pros understand and modify behavior.
UX design should enable photographers to trade automatic behavior for deterministic control — sometimes a human will want to lock focus even if AI suggests otherwise.
6. Firmware ecosystems, dataset curation, and continuous improvement
A decisive trend in contemporary camera engineering is shipping intelligence improvements via firmware and model updates.
6.1. Firmware as the upgrade path
Canon and competitors increasingly deliver AF improvements post-launch via firmware updates (improved animal detection, better subject biasing). Cameras with onboard NPUs enable model updates and new behavior without hardware replacements; this is crucial for competitive differentiation and long product life cycles. Canon’s track record of shipping meaningful AF upgrades via firmware supports this approach. (Canon U.S.A.)
6.2. Data: annotation, diversity, and ethics
Training robust detectors and motion predictors requires curated datasets:
- High-frame-rate video for motion modeling (120–240 fps where possible) with accurate bounding boxes, keypoints, and occlusion flags.
- Class diversity: birds across species, athletes across sports, vehicles, etc., because dynamic priors differ by subclass.
- Edge cases: reflections, glass, netting, foliage — where current systems fail most frequently.
Canon should develop an opt-in data collection program that allows users to contribute anonymized telemetry and frames, with explicit consent and clear opt-out. Professional partners (sports leagues, wildlife organizations) can provide labeled corpora for domain-specific fine-tuning. Legal and ethical constraints must be enforced: no face recognition or personally identifying model training without explicit, well-documented consent. (Canon Georgia)
7. Thermal, power, and practical engineering constraints
Integrating NPUs and high-rate telemetry has costs.
7.1. Power & heat trade-offs
NPUs and stacked sensors increase power draw. Professional bodies like the R1 use magnesium and graphite heat paths to manage thermal budgets; future bodies must continue this engineering focus while balancing ergonomics. Thermal ceilings force conservative continuous inference budgets (e.g., run heavy per-candidate models sporadically, schedule full compute bursts only when battery and thermal headroom permit). Canon’s R1 thermal design decisions illustrate these tradeoffs. (Canon U.S.A.)
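A conservative inference gate of the kind described above can be very small. The sketch below is a placeholder policy with assumed thresholds (temperature ceiling, battery floor, per-second duty budget), not measured Canon limits.

def allow_heavy_inference(body_temp_c, battery_pct, heavy_ms_last_second,
                          temp_ceiling_c=58.0, min_battery_pct=15.0,
                          duty_budget_ms=200.0):
    """Gate heavy per-candidate models on thermal headroom, battery level,
    and a rolling per-second compute duty budget."""
    if body_temp_c >= temp_ceiling_c:
        return False            # thermal ceiling reached: skip heavy models
    if battery_pct < min_battery_pct:
        return False            # preserve remaining shooting time
    return heavy_ms_last_second < duty_budget_ms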
7.2. Backward compatibility and third-party lenses
Canon must preserve the RF mount ecosystem. New telemetry or cooperative control protocols should be versioned, with graceful fallbacks for lenses lacking advanced features. Provide clear developer documentation and firmware tools for third parties to adopt richer telemetry, encouraging ecosystem adoption.
8. Proposed system architecture (concrete design)
Below is a compact architectural design that is implementable by Canon engineering teams within a two-to-three product-cycle horizon.
8.1. Hardware stack
- Sensor: Stacked BSI CMOS with mixed PD pixel types (vertical/horizontal/diagonal microstructures) and an optional small TOF/polarization assist array; low-latency AF readout windows. (Canon Global)
- SoC: Next-gen DIGIC with integrated NPU supporting 8–16 TOPS (quantized INT8/INT16), or DIGIC + discrete neural accelerator co-processor on the logic board. (Sony Semiconductor)
- Lens interface: RF mount with formalized telemetry channels: timestamped focus position, motor current, lens temperature, optional lens IMU. (Canon Europe)
- Memory: Low-latency on-die memory for sensor buffers, and NVMe-class host storage for burst buffering.
8.2. AF processing pipeline
- High-frequency detector (every frame): tiny CNN to produce candidate bboxes + class; runs on NPU at 60–120 Hz.
- Tracker manager: maintains active tracks, runs KFs/PFs for each track, fuses lens and IMU telemetry.
- Verifier network (on demand): per-track embedding + pose/keypoint net; runs at reduced frequency (10–30 Hz) or on compute budget.
- Planner: decides lens trajectories, shutter timing, and capture windows based on predicted subject plane and lens readiness.
- Firmware updater & model manager: secure module to update detection/tracking networks and apply profile imports.
9. Algorithms and pseudocode (practical)
Below is high-level pseudocode for the two-stage detector + probabilistic tracker. This is intentionally compact; an expanded implementation would include threads, memory-safe queues, quantized model loading, and device-specific optimizations.
Initialize:
load detector_model (NPU, tiny)
load verifier_model (NPU)
initialize track_list = []
set classifier_priors per class
Per frame (timestamp t, image I):
detections = detector_model.run(I) # bboxes, class_probs, scores
for each track in track_list:
# Predict track forward using KF (state: x, v, s)
track.predict(dt = t - track.last_update)
# Associate detections -> tracks with gated Hungarian using Mahalanobis
matches, unmatched_dets, unmatched_tracks = associate(detections, track_list)
for (det, track) in matches:
# Update track with measurement
z = measurement_from(det, lens_telemetry, IMU)
track.update(z)
track.last_update = t
track.confidence = compute_confidence(det.score, embedding_sim)
if track.confidence < THRESH and compute_budget_allow:
# run verifier to compute embedding and pose
emb = verifier_model.extract_embedding(I.crop(det.bbox))
track.update_embedding(emb)
for det in unmatched_dets:
# Initialize new tentative tracks or attempt re-ID with verifier
emb = verifier_model.extract_embedding(I.crop(det.bbox))
if emb matches any inactive track within threshold:
revive track with emb
else:
create tentative track with higher process noise
for track in unmatched_tracks:
track.miss_count += 1
if track.miss_count > MISS_LIMIT:
move track to inactive_pool
# Planner: compute target_focus_depth using best_active_track
target = select_primary_track(track_list)
focus_pos = depth_mapping(target.scale, lens_calibration)
send_focus_trajectory(focus_pos, deadline = shutter_time_estimate)
# capture decision: if shutter_time aligns with predicted subject in focus and lens ready => fire
Compute budgeting, quantization, and NPU task scheduling must be implemented to meet hard real-time constraints for the high-frequency detector loop. Heavy verifier runs should be scheduled during inter-frame micro-gaps or when the thermal budget allows.
10. Evaluation methodology: metrics, datasets, and testing rigs
Engineering progress must be measured. Suggested metrics:
- Time-to-focus (TTF) under motion: median and 95th percentile for classed datasets (birds, cars, athletes).
- Tracking accuracy: IoU and center-error over time for continuous sequences with occlusions.
- Identity retention: % of sequences where the intended subject remains primary after 1 s, 2 s, 5 s in occlusion scenarios.
- Capture success rate: % of burst sequences where the subject's eyes are sharp within tolerance.
- Power/thermal: inference energy per second and body surface temperature rise.
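Two of these metrics reduce to very small scoring routines, sketched below: bounding-box IoU for tracking accuracy and median/95th-percentile time-to-focus. The box format and millisecond units are assumptions.

import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def ttf_percentiles(ttf_samples_ms):
    """Median and 95th-percentile time-to-focus over a classed test set."""
    samples = np.asarray(ttf_samples_ms, dtype=float)
    return float(np.percentile(samples, 50)), float(np.percentile(samples, 95))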
Datasets:
- High-FPS sport/wildlife corpora: curated by Canon with opt-in contributors and partnerships.
- Synthetic perturbation sets: simulate netting, reflections, and aggressive lighting to measure failure modes.
Test rigs:
- Motion platform: programmable linear/rotary rigs to reproduce predictable trajectories and allow repeatability.
- Bird simulators: mechanically actuated wing models for controlled occlusion and flapping tests.
- Field validation: measure performance in real capture conditions (stadium, birds at feeders).
11. Roadmap: near- to mid-term steps
- Integrate a moderate NPU into the next DIGIC refresh (4–8 TOPS) for detector + verifier workloads; optimize models for INT8 quantization. (Sony Semiconductor)
- Release lens telemetry standard v1 enabling timestamped focus position and motor current. Encourage third parties. (Canon Europe)
- Expand DPAF orientation capability to more pixels or dynamically switchable patterns to reduce directionality failure modes. (Canon U.S.A.)
- Move to stacked BSI sensors with dedicated AF readout windows and limited on-die pre-processing for focus confidence signals. (ScienceDirect)
- Introduce cooperative body-lens control and new pro lenses with high-resolution encoders and embedded IMUs. (The-Digital-Picture.com)
- Deploy continuous learning pipeline (opt-in) for domain fine-tuning and push model updates via firmware. (Canon U.S.A.)
12. Risks and open challenges
- Thermal and battery life: NPUs increase load; thermal and ergonomic design must protect run time and body temperature. (Canon U.S.A.)
- Privacy & dataset governance: any data collection must be opt-in and privacy-preserving; avoid training models that enable face recognition unless explicitly requested and consented.
- Ecosystem adoption: third-party lens makers must be incentivized to support richer telemetry, or the benefit will be constrained to Canon-native glass.
- Complexity of UI: added automation must not reduce predictability for pros; provide both automatic and deterministic manual options.
Conclusion
The next major advances in Canon AF will not come from a single innovation but from systems integration: stacking sensor innovations (multi-directional PD, stacked readouts), embedding neural compute for sophisticated detection and learned motion priors, and closing the actuation loop with rich lens telemetry and cooperative control. When those pieces are combined and delivered with careful UX that respects professional workflows (firmware updates, user personalization, explainable feedback), Canon can move beyond the R1/R5 Mark II generation: from bodies that are merely faster or cleverer to bodies that are predictably reliable in the hardest real-world scenarios." (Source: ChatGPT, 2025)
References
Canon. (2018, April 27). Canon autofocus series: Dual Pixel CMOS AF explained. Canon USA. Retrieved from Canon learning/training articles. (Canon U.S.A.)
Canon. (2024). EOS R1 technology overview. Canon Global. Retrieved December 16, 2024. (Canon Global)
Canon USA. (n.d.). EOS R1 body & features. Canon USA product page. (Canon U.S.A.)
Canon USA. (n.d.). EOS R1 support: Dual Pixel CMOS AF cross-type description. Canon support documentation. (Canon U.S.A.)
Canon. (n.d.). RF mount technical explanation. Canon Europe Pro infobank. (Canon Europe)
Sony Semiconductor Solutions. (2024, September 30). IMX500 intelligent vision sensor announcement. Sony Semiconductor Solutions. (Sony Semiconductor)
Element14 Community / Sony IMX500. (2024, Sep 30). Raspberry Pi AI Camera (IMX500). (element14 Community)
Jang, J., et al. (2015). Sensor-based auto-focusing system using multi-scale feature extraction and phase correlation matching. Open-access article via PMC. (PMC)
Liu, Y., et al. (2025). In-sensor computing for rapid image focusing. ScienceDirect (article abstract). (ScienceDirect)
Canon. (n.d.). Canon RF lens technology & RF mount advantages. The Digital Picture / Canon lens information. (The-Digital-Picture.com)
Canon USA. (n.d.). EOS R5 Mark II Firmware Notices & updates. Canon support pages. (Canon U.S.A.)
TechRadar. (2024). Raspberry Pi AI camera with Sony IMX500 on-sensor AI. (TechRadar)
Disclaimer
The 'The Future of Canon EOS R AF Systems' report was compiled by ChatGPT on the request of Vernon Chalmers Photography. Vernon Chalmers Photography was not instructed by any person, public / private organisation or 3rd party to request compilation and / or publication of the report on the Vernon Chalmers Photography website.
This independent status report is based on information available at the time of its preparation and is provided for informational purposes only. While every effort has been made to ensure accuracy and completeness, errors and omissions may occur. The compiler of this 'The Future of Canon EOS R AF Systems' report (ChatGPT) and / or Vernon Chalmers Photography (in its capacity as report requester) disclaim any liability for any inaccuracies, errors, or omissions and will not be held responsible for any decisions made based on this information.