In the pursuit of maximizing hardware utilization for Large Language Model (LLM) Reinforcement Learning, systems have introduced new sampling strategies such as Partial Rollout and Asynchronous Rollout. These strategies produce complex off-policy data, which we categorize into Sliding Latest Policy Trajectories (SLAPTs) and Multiple Consistent Stale Policy Trajectories (MCSPTs). This post discusses the properties of applying Importance Sampling (IS) to these distinct data forms, combining theoretical tools such as Multiple Importance Sampling and Rényi divergence with practical observations, such as the (approximately) identical distribution structure of SLAPTs of similar lengths.
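
To ground the discussion, here is a minimal sketch (not taken from any specific system) of the basic operation the post builds on: computing per-token importance sampling ratios between the current policy and the stale behavior policy that generated an off-policy trajectory. The function name, tensor shapes, and the optional clipping parameter are illustrative assumptions.

```python
import torch

def importance_weights(target_logprobs: torch.Tensor,
                       behavior_logprobs: torch.Tensor,
                       clip: float = 10.0) -> torch.Tensor:
    """Per-token IS ratios r_t = pi_theta(a_t | s_t) / pi_behavior(a_t | s_t).

    Both inputs are assumed to have shape (batch, seq_len) and hold log-probs
    of the sampled tokens under the current policy and the (possibly stale)
    behavior policy. The clip cap is one common way to bound the estimator's
    variance, which connects to the Rényi-divergence view discussed later.
    """
    ratios = torch.exp(target_logprobs - behavior_logprobs)
    return ratios.clamp(max=clip)
```

Whether these ratios come from a single sliding behavior policy (SLAPTs) or from several fixed stale policies (MCSPTs) changes how they should be combined, which is the core question the rest of the post examines.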