Figure 8. Comparison of training results in different scenarios: TD3 with load clustering (A), TD3 with prior knowledge and load clustering (B), and the baseline scenario (C). TD3: twin delayed deep deterministic policy gradient; TD3base: baseline scenario that neither employs load clustering nor incorporates expert knowledge into the reward function; TD3c2: scenario with clustering under heating electricity load; TD3c1: scenario with clustering under cooling electricity load; TD3c0: scenario with clustering under lower-level mixed cooling, heating, and electricity load; TD3c2-wpk: scenario with clustering and expert knowledge under heating electricity load; TD3c1-wpk: scenario with clustering and expert knowledge under cooling electricity load; TD3c0-wpk: scenario with clustering and expert knowledge under lower-level mixed cooling, heating, and electricity load.



