DAT Benchmark

Problems and Needs

Targeted Problem

Existing Methods: The PVT task assumes that the target is within the preset frame and the camera remains stationary.

Real-World Task: The target is highly dynamic. The main challenge is actively controlling the camera to improve visual accuracy.

VAT Task: The VAT task models visual tracking and control jointly, and covers a wider range of application scenarios.

Reinforcement Learning for Solving the VAT Task

Existing Methods: Directly connecting a VT model to a control model for VAT (e.g., SiamMask + PID).

Existing Limitations:
1. The delay in the VT model results in lag in control input, which can even cause the controller to diverge.
2. The control model is particularly sensitive to its parameters and requires fine-tuning for each scenario.
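
The first limitation can be illustrated with a toy closed-loop simulation. The sketch below is not the SiamMask + PID pipeline itself: it assumes a one-dimensional camera whose angle integrates the control signal, a stationary target, and a tracker whose measurement arrives `delay_steps` control ticks late. The `PID` class, the `simulate` function, and all gains are hypothetical choices made for illustration.

```python
from collections import deque


class PID:
    """Textbook PID controller (hypothetical helper for this sketch)."""

    def __init__(self, kp, ki=0.0, kd=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(delay_steps, kp=5.0, dt=0.1, steps=100):
    """Return the true tracking error at every control tick.

    Assumption: the visual tracker reports the error `delay_steps`
    ticks late, and the camera angle integrates the control signal.
    """
    target, camera = 1.0, 0.0
    pid = PID(kp)
    # Pre-fill the buffer so the controller sees the initial error
    # until the first delayed measurement arrives.
    buffer = deque([target - camera] * (delay_steps + 1),
                   maxlen=delay_steps + 1)
    errors = []
    for _ in range(steps):
        true_error = target - camera
        errors.append(true_error)
        buffer.append(true_error)      # newest measurement in, oldest out
        delayed_error = buffer[0]      # what the controller actually sees
        camera += pid.step(delayed_error, dt) * dt
    return errors


if __name__ == "__main__":
    for d in (0, 5):
        errs = simulate(d)
        print(f"delay={d}: largest late error = "
              f"{max(abs(e) for e in errs[-20:]):.3g}")
```

Under these assumptions, `simulate(0)` drives the error toward zero, while `simulate(5)` with the same gain produces growing oscillations. Lowering `kp` sufficiently restores stability, which is the kind of per-scenario tuning the second limitation describes.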

Advantages of RL: Formulating the task as a Markov decision process (MDP) allows vision and control to be modeled jointly, so a single model solves the problem.
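
To make the MDP view concrete, here is a minimal sketch of active tracking as an environment with `reset` and `step`. It is a one-dimensional toy, not the benchmark's actual interface: the class `ToyActiveTrackingEnv`, the 21-pixel observation row, the three pan actions, and the centering reward are all assumptions made for illustration.

```python
import random


class ToyActiveTrackingEnv:
    """1-D toy MDP: pan the camera so a moving target stays centered.

    Observation: a row of WIDTH pixels with a 1 where the target appears.
    Action: pan by -1, 0, or +1 pixel per step.
    Reward: 1 at the image center, falling linearly toward the border;
            -1 when the target leaves the field of view.
    """

    WIDTH = 21
    HALF = WIDTH // 2

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.offset = 0  # target position relative to the image center

    def _observe(self):
        row = [0] * self.WIDTH
        if abs(self.offset) <= self.HALF:
            row[self.offset + self.HALF] = 1
        return row

    def reset(self):
        self.offset = self.rng.choice((-3, 3))
        return self._observe()

    def step(self, action):
        target_motion = self.rng.choice((-1, 0, 1))
        # Panning toward the target reduces its offset in the image.
        self.offset += target_motion - action
        done = abs(self.offset) > self.HALF
        reward = -1.0 if done else 1.0 - abs(self.offset) / self.HALF
        return self._observe(), reward, done


def greedy_policy(observation):
    """Hand-written stand-in for a learned policy: pan toward the target."""
    if 1 not in observation:
        return 0
    pixel = observation.index(1) - ToyActiveTrackingEnv.HALF
    return (pixel > 0) - (pixel < 0)  # sign of the pixel offset


def run_episode(env, policy, steps=50):
    obs = env.reset()
    rewards = []
    for _ in range(steps):
        obs, reward, done = env.step(policy(obs))
        rewards.append(reward)
        if done:
            break
    return rewards
```

The policy maps raw observations directly to camera actions, so perception and control sit inside one decision loop. An RL agent would replace `greedy_policy` with a policy learned from the reward signal.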

Simulation Environment: Trial-and-error RL and data collection in the real world are too expensive, so simulation environments must be built.

Complex and Diverse Simulation Environments

Realism: Modeling the diversity of real-world environments while discarding variables that are unattainable in real scenarios.

Diversity: Multiple scenarios, various trackers and targets, multiple sensors.

Unattainable Variables: The position, velocity, and acceleration of the target relative to the tracker.
