1 Introduction
1.1 Motivation
1.2 Research questions
1.3 Research outline
2 The distributed data management environment
2.1 The Worldwide LHC Computing Grid
2.2 The File Transfer Service
2.3 Rucio
2.3.1 Rucio Data IDentifiers
2.3.2 Rucio Storage Elements
2.3.3 Replication rules and subscriptions
2.3.4 Replica management and transfers
3 Data selection and model metrics
3.1 Rucio data extraction and selection
3.1.1 Transfers and Deletions
3.1.2 FTS Server
3.1.3 TAPE activities
3.1.4 Failed transfers
3.1.5 Data extraction and treatment
3.2 Metric selection
3.2.1 MSE and RMSE
3.2.2 MAE and MedAE
3.2.3 MSLE and RMSLE
3.2.4 Explained Variance and R2 Score
3.2.5 Mean Tweedie Deviance
3.2.6 MAPE and RE
3.2.7 FoGP
3.2.8 Metrics comparison experiment
4 Model of intra-rule Rule TTC extrapolation
4.1 Transfers per rule distribution
4.2 The α and α0 models
4.3 Evaluation of results
5 Model of Rule TTC based on time series analysis
5.1 Problem framing
5.2 The β models
5.3 The γ models
6 Model of Rule TTC based on deep neural networks
6.1 The δn Model
6.2 The δννn Model
6.3 Comparison of the models' performance
7 Network time to predict Transfer TTC and Rule TTC
7.1 Network Time for a single transfer
7.2 Network Time as a Transfer TTC and Rule TTC estimator
7.3 Results
8 FTS Queue Time to predict Transfer TTC and Rule TTC
8.1 FTS queue modeling
8.2 Modeling the FTS queue from Rucio data
8.3 Using FTS Queue Time as a Transfer TTC and a Rule TTC predictor
9 Results and conclusion
9.1 Models summary
9.2 Model κ
9.3 Model α
9.4 Models β(t0, ρ) and β∗(t0, ρ)
9.5 Model γ(t0, ρ, λ, ψ, ω)
9.6 Models δ and δνν
9.7 Models based on individual transfers
9.8 Conclusion and final remarks
10 Future work
10.1 Possible extensions to the δνν model
10.2 More complex auto-regressive models