Advances in Methodology and Statistics / Metodološki zvezki, Vol. 18, No. 2, 2021, 73–88
https://doi.org/10.51936/gktc3784

Time series clustering based on time-varying Hurst exponent

Alex Babiš ∗, Beáta Stehlíková
Comenius University, Faculty of Mathematics, Physics and Informatics, Bratislava, Slovakia

Abstract

We consider the problem of clustering time series which are assumed to possess long-term memory. We propose an approach based on combining the results obtained by applying different methods for estimating the time-varying Hurst exponent, and we apply it to Euro exchange rates. Firstly, we fit AR-GARCH models to every time series to reduce the bias of the rescaled range analysis method. We only consider models with residuals in which no autocorrelation and no ARCH effect is present; among them we choose the model with the lowest value of the Bayesian information criterion. Afterwards, we estimate the Hurst exponent from the residuals by means of the rolling window approach using four different estimation methods. Vectors of Hurst exponents are clustered for each of the four cases and the clusters are compared in order to obtain the final clustering.

Keywords: Hurst exponent, Clustering, Stock market, Time series, GARCH

1. Introduction

Clustering, also known as cluster analysis, is an important tool for analysing data. Clustering methods partition data into several homogeneous groups called clusters. Clusters are created so that similarity between objects within a specific cluster is maximized, while at the same time similarity between objects that do not belong to the same cluster is minimized. Clustering can be used as a part of exploratory data analysis, as it allows us to gain information from the underlying data without explicit knowledge about the relationships between the objects within. It can also provide useful insights into the structure of the data, and the identification of groups containing similar observations might be of interest on its own. Usually, information about objects is given by a vector of features. In many applications a vector of features arises by observing specific characteristics of an object at different time intervals.
Resulting vectors therefore have the form of time series.

Clustering of time series data has found its way into a wide range of areas, such as astronomy, medicine, environmental analyses, etc. We refer the reader to the review paper by Aghabozorgi et al. (2015) for more applications and concrete references. In finance, where our data set also belongs, applications include, for example, clustering stocks for use in portfolio optimization (Han & Ge, 2020; Iorio et al., 2018; Massahi et al., 2020) or clustering aimed at discovering the structure of the cryptocurrency market (Song et al., 2019).

∗ Corresponding author. Email addresses: alexbabis96@gmail.com (Alex Babiš), stehlikova@fmph.uniba.sk (Beáta Stehlíková)

Similarly to general cluster analysis, there is a great number of different methods and approaches. Time series clustering can be based on clustering the data directly, on extracting their features, or on models which were fit to the data. There are also new distance metrics proposed specifically for time series. More details can be found, for example, in survey papers (Aghabozorgi et al., 2015; Fu, 2011) or in a recent book by Maharaj et al. (2019).

Our approach is based on estimating the Hurst exponent of the time series, which is a measure of long-range dependence in the time series. The origin of the Hurst exponent dates back to 1951, when the British hydrologist Harold E. Hurst proposed a method to optimize the storage capacity of reservoirs in an effort to regulate the natural contribution of the Nile river, keeping in mind cyclical trends such as periods of drought and floods. His statistical analysis of the hydrological data was not in accord with the standard models of that time and subsequently led to models describing behaviour that could be characterized as long-range dependence or long memory (O’Connell et al., 2016). Applications of the Hurst exponent in finance include analyses of interest rates (Cajueiro & Tabak, 2009), hedge fund performance (Auer, 2016), energy futures markets (Sensoy & Hacihasanoglu, 2014), cryptocurrencies (Jiang et al., 2018), efficiency of stock markets (Cajueiro & Tabak, 2004) and others.
In the same way as Cajueiro and Tabak (2004), we apply the Hurst exponent estimators to standardized residuals from an AR-GARCH model. It was shown that the presence of short memory could bias the estimated value of the Hurst exponent; using this procedure we filter out the short-memory information. In this way our analysis of long-range memory is not affected by short-memory effects present in the data. The paper by Lahmiri (2016) used different Hurst exponent estimators to cluster industrial sectors at the Casablanca Stock Exchange. We follow a similar idea in our approach. However, instead of using a single estimate of the Hurst exponent for the whole time series, we use the rolling window approach. In other settings it has been successfully used, among others, in Cajueiro and Tabak (2004), Jiang et al. (2018), and Sensoy and Hacihasanoglu (2014). We use this approach for subsequently clustering the time series of Hurst exponent estimates.

Our contribution therefore lies in combining several approaches that have been used individually in the literature dealing with financial time series and their Hurst exponents, but not in this combination: using residuals from AR-GARCH models for the estimation of the Hurst exponent, using rolling window estimates, and clustering the time series. Furthermore, we propose a network-based method for clustering, based on the results from an arbitrary number of clustering algorithms applied to the data. It can be used in a more general setting, not only with our choices of methods for estimating the Hurst exponents and of the clustering procedures.

The rest of the paper is organized as follows. In Section 2 we review the notion of the Hurst exponent and its estimators, which we will use in our analysis. Section 3 presents our data set and Section 4 summarizes the results of GARCH modelling applied to the data. Section 5 shows the Hurst exponent estimates and their clusterings. In Section 6 we compare these clusters and suggest the final clustering of the time series. We conclude the paper with remarks on the methodology, its advantages and shortcomings, and with ideas for future research in Section 7.

2. Long-range dependence in time series and estimation of the Hurst exponent

If the dependence between observations of a stationary time series that are far apart from each other decreases very slowly as the time distance between them increases, then the time series is said to exhibit long-range dependence or long memory. More specifically, the autocorrelations $\rho(s) = \mathrm{Cor}(X_t, X_{t-s})$ decay to zero so slowly that they are not absolutely summable, i.e. $\sum_{s=0}^{\infty} |\rho(s)| = \infty$. This is in contrast to ARMA models, for which the autocorrelation function decays exponentially and therefore the sum of its absolute values is finite. A typical long memory process has $\rho(s) \sim |s|^{-\alpha}$ with $\alpha \in (0,1)$ as $s \to \infty$. Other models of long memory processes include ARFIMA models with fractional differences (in contrast to integer differences in ARIMA models); long memory processes can also be characterized via their spectrum or via the so-called Hurst exponent. A detailed treatment of long memory processes can be found in Beran (2017).

The Hurst exponent, which we use in our analysis, takes values in the interval $(0,1)$ and, if different from $1/2$, it is linked to the asymptotic behaviour of the autocorrelation function by the relation $\rho(s) \sim H(2H-1)|s|^{2H-2}$. The case $H = 1/2$ corresponds to processes with exponentially decaying autocorrelations, i.e. without long memory. Values $H \in (1/2, 1)$ correspond to persistent processes, while values $H \in (0, 1/2)$ correspond to anti-persistent processes.

The oldest and probably the best-known method for estimating the Hurst exponent is rescaled range (R/S) analysis, originally proposed by Hurst (1951) himself and further developed by Mandelbrot and Wallis (1969). We outline this method according to Weron (2002) and afterwards explain its modifications which we have used in our analysis, using their implementation in the R package pracma (Borchers, 2019), in particular the function hurstexp().

Let $\{X_t\}_{t=1}^{L}$ be a stationary time series of length $L$. The Hurst exponent can be estimated as follows:

1. The time series of length $L$ is divided into $d$ sub-series of length $n$.
2. For each sub-series, indexed by $m$, the mean $E_m$ and the standard deviation $S_m$ are calculated.
3. The data $X_{i,m}$ are then normalized by subtracting the mean $E_m$: $\hat{X}_{i,m} = X_{i,m} - E_m$ $(i = 1, \dots, n)$.
4. The next step is to calculate a new time series of cumulative deviations from the mean value for each sub-series: $Y_{i,m} = \sum_{j=1}^{i} \hat{X}_{j,m}$ $(i = 1, \dots, n)$.
5. The range $R_m$ is calculated as $R_m = \max\{Y_{1,m}, \dots, Y_{n,m}\} - \min\{Y_{1,m}, \dots, Y_{n,m}\}$.
6. Each range $R_m$ is then rescaled (normalized) by the standard deviation of the corresponding sub-series as $R_m / S_m$.
7. Finally, the mean value of the rescaled range over all sub-series of length $n$ is computed: $\frac{R}{S}(n) = \frac{1}{d} \sum_{m=1}^{d} \frac{R_m}{S_m}$.
8. The steps above are repeated for increasing length $n$. Only the values of $n$ which include the first and last points of the time series are used, so $\frac{R}{S}(n)$ is calculated from the same number of observations for each $n$.

It was shown (cf. Di Matteo, 2007; Mandelbrot, 1975; Mandelbrot & Wallis, 1969; Taqqu et al., 1995) that the $\frac{R}{S}$ statistic asymptotically follows the relation $\frac{R}{S}(n) \sim c\,n^{H}$. Taking the logarithm leads to

$$\log\left(\frac{R}{S}(n)\right) \sim H \log(n) + \log(c). \tag{2.1}$$

This means that in order to estimate the value of the Hurst exponent $H$ it is sufficient to run a simple linear regression over a sample of increasing lengths $n$.

The algorithm above has been modified in several ways in the literature. The simplest form of the rescaled range analysis does not separate the original time series into sub-series but rather considers the whole time series, as suggested originally in Hurst (1951). This leads to only one $\frac{R}{S}(n)$ statistic, which means that taking

$$\frac{\log\left(\frac{R}{S}(n)\right)}{\log(n)}$$

is sufficient to estimate the Hurst exponent $H$. This method is referred to as simplified rescaled range analysis.

Results of the rescaled range analysis can depend on the choice of the lengths of sub-series $n$ that are used as input into the regression. If the starting value of $n$ is chosen as the length of the original time series and then progressively halved, then, if $n \neq 2^{i}$ for every $i$ (that is, if the length is not a power of two), the last sub-series may be of a different length than all the previous ones.
The resulting statistic will be referred to as corrected rescaled range analysis.

A better way to estimate the Hurst exponent $H$ via classical rescaled range analysis is to consider only those lengths of sub-series $n$ for which the length of the original series is a multiple of $n$. This statistic will be referred to as empirical rescaled range analysis.

As stated in Annis and Lloyd (1976) and Peters (1994), for small values of $n$ the deviation of the slope in the regression (Equation (2.1)) from its true value is significant even in a simple case, when the underlying process is Gaussian noise. They approximate the theoretical values of $\frac{R}{S}(n)$ as

$$E\left(\frac{R}{S}(n)\right) = \begin{cases} \dfrac{n - \frac{1}{2}}{n} \, \dfrac{\Gamma\left(\frac{n-1}{2}\right)}{\sqrt{\pi}\, \Gamma\left(\frac{n}{2}\right)} \displaystyle\sum_{i=1}^{n-1} \sqrt{\dfrac{n-i}{i}} & \text{for } n \le 340, \\[2ex] \dfrac{n - \frac{1}{2}}{n} \, \dfrac{1}{\sqrt{n \frac{\pi}{2}}} \displaystyle\sum_{i=1}^{n-1} \sqrt{\dfrac{n-i}{i}} & \text{for } n > 340, \end{cases}$$

where $\Gamma$ is the Euler gamma function. As pointed out in Weron (2002), the Hurst exponent can be estimated more precisely as 0.5 plus the slope from the regression of $\frac{R}{S}(n) - E\left(\frac{R}{S}(n)\right)$ on $\log(n)$. This is referred to as corrected empirical rescaled range analysis.

3. Data

The data used in our analysis are daily Euro foreign exchange rates in 2018–2020. They are based on a regular daily concertation procedure between central banks across Europe and are made available by the European Central Bank. In particular, we study the exchange rates for the following currencies: USD (United States dollar), JPY (Japanese yen), CZK (Czech koruna), DKK (Danish krone), GBP (Pound sterling), HUF (Hungarian forint), PLN (Polish złoty), RON (Romanian leu), SEK (Swedish krona), CHF (Swiss franc), ISK (Icelandic króna), NOK (Norwegian krone), HRK (Croatian kuna), RUB (Russian ruble), TRY (Turkish lira), AUD (Australian dollar), BRL (Brazilian real), CAD (Canadian dollar), CNY (Chinese yuan renminbi), HKD (Hong Kong dollar), IDR (Indonesian rupiah), ILS (Israeli shekel), INR (Indian rupee), KRW (South Korean won), MXN (Mexican peso), MYR (Malaysian ringgit), NZD (New Zealand dollar), PHP (Philippine peso), SGD (Singapore dollar), THB (Thai baht), and ZAR (South African rand).
In order to make the time series stationary, we follow the standard procedure of working with differences of logarithms of the rates. Figure 1 shows a selection of the data. We note that the volatility of the time series seems to vary in time, which motivates us to use GARCH models for their modelling. Finding particular reasons for nonconstant volatility in the exchange rate data would need a standalone analysis. Here, we only note that this is not a new phenomenon; it has been studied in many papers (e.g., Feng et al., 2021; Kido, 2016; Manasseh et al., 2019; You & Liu, 2020; Zhou et al., 2020).

Figure 1: Sample of the data, differences of logarithms of the selected exchange rates

4. GARCH models

Let us recall that a standard autoregressive AR($p$) model for a stationary time series $x_t$ takes the form

$$x_t = \delta + a_1 x_{t-1} + \dots + a_p x_{t-p} + u_t,$$

where the error term $u_t$ is a white noise. The parameters are required to satisfy certain conditions to ensure stationarity of the process (Kirchgässner et al., 2013). However, in financial applications it is often the case that the assumption of a constant variance of the white noise is not consistent with the observed data. The time-varying variance of the data can be captured by GARCH processes, which model the variance $\sigma_t^2$ of the process $u_t$ by the equation

$$\sigma_t^2 = \omega + \alpha_1 u_{t-1}^2 + \dots + \alpha_p u_{t-p}^2 + \beta_1 \sigma_{t-1}^2 + \dots + \beta_q \sigma_{t-q}^2,$$

where again the parameters are required to satisfy stationarity conditions. This process is known as a GARCH($p$,$q$) process; we refer the reader to Kirchgässner et al. (2013) for details.

We use the garchFit() function from the R package fGarch (Wuertz et al., 2020) to estimate GARCH models and to obtain the results of the statistical tests necessary for evaluating the models. The residuals of the models are tested in order to assess the suitability of the proposed GARCH models. Following the standard procedures implemented in the fGarch package, we use the Ljung-Box test for the residuals and the squared residuals, and the heteroscedasticity test. In the model selection we consider autoregressive AR($p$) processes with orders $p \le 3$ with a GARCH($p$,$q$) error term with orders satisfying $p + q \le 3$.
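The two ingredients described above, differences of logarithms and the GARCH variance equation, can be illustrated with a short sketch. It is written in plain Python rather than with the fGarch package used in the paper, it performs no parameter estimation, and it treats the returns themselves as the error term $u_t$ (an AR(0) model with $\delta = 0$). The function names, the toy rates, the parameter values and the choice to start the recursion at the sample variance are all assumptions made for the example.

```python
import math

def log_returns(rates):
    """Differences of logarithms of consecutive rates."""
    return [math.log(b) - math.log(a) for a, b in zip(rates, rates[1:])]

def garch11_variance(u, omega, alpha, beta):
    """Conditional variances of a GARCH(1,1) process for given parameters:
    sigma2[t] = omega + alpha * u[t-1]**2 + beta * sigma2[t-1].
    The recursion is started at the sample variance of u (an assumption).
    """
    mean = sum(u) / len(u)
    sigma2 = [sum((x - mean) ** 2 for x in u) / len(u)]
    for t in range(1, len(u)):
        sigma2.append(omega + alpha * u[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2

def standardized_residuals(u, sigma2):
    """Residuals divided by their conditional standard deviations."""
    return [x / math.sqrt(s) for x, s in zip(u, sigma2)]

# Toy values, not taken from the data set of the paper.
rates = [1.10, 1.12, 1.11, 1.15, 1.14, 1.13]
u = log_returns(rates)
# Hypothetical parameters chosen only for illustration.
sigma2 = garch11_variance(u, omega=1e-6, alpha=0.1, beta=0.85)
z = standardized_residuals(u, sigma2)
```

In an actual analysis the parameters would come from a fitted model, and the standardized residuals `z` would be the series passed on to the Hurst exponent estimators.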
From the models with residuals passing the tests given above at the 5% significance level, we select the model with the lowest Bayesian information criterion. The exchange rates IDR (Indonesian rupiah) and ILS (Israeli shekel) were excluded from the data due to the fact that none of the models considered was suitable for them. The resulting models for the remaining exchange rates are given in Table 1.

Table 1: Autoregressive models with GARCH errors

| Model | Exchange rates |
|---|---|
| AR(0)+GARCH(1,1) | USD, DKK, HUF, JPY, GBP, SEK, RUB, AUD, CNY, CHF, BRL, HKD, INR, MXN, NZD, CAD, SGD, THB, ZAR |
| AR(0)+GARCH(2,1) | CZK |
| AR(0)+GARCH(1,2) | NOK, KRW, MYR, PHP |
| AR(1)+GARCH(1,1) | HRK, TRY |
| AR(2)+GARCH(1,1) | PLN |
| AR(3)+GARCH(1,2) | RON |

5. Time-varying Hurst exponents and their clustering

As outlined in the introduction, we use the approach from Cajueiro and Tabak (2004), Jiang et al. (2018), and Sensoy and Hacihasanoglu (2014), and we do not represent a time series by a single estimated Hurst exponent. Instead, we represent it by a sequence of Hurst exponents estimated from shorter time windows to capture regime changes within the data. The main reason is that in many financial time series we can observe cycles of irregular length in which the dynamics vary. It is reasonable to assume that this is also true for the Hurst exponent. Another reason for choosing a sequence of Hurst exponents over one particular Hurst exponent estimate is that we might also be interested in studying the reaction of the exchange rate dynamics during a specific time window to the information that was dominating at that time.

We choose a rolling window approach to estimate the sequence of Hurst exponents for each exchange rate, with the window size selected to be 252 days, which is approximately one year of data (since the data are available only on business days). This means that for each sequence $\{X_j\}_{j=i}^{i+w-1}$, with $w$ being the size of the window and $i = 1, \dots, n-w+1$ (here $n$ denotes the length of the series), we estimate $H_i$ as the Hurst exponent for the particular time period. This results in a sequence of Hurst exponents $\{H_i\}_{i=1}^{n-w+1}$.
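The rescaled range steps of Section 2 and the rolling window just described can be sketched together. The code below is a plain Python illustration, not the pracma hurstexp() implementation used in the paper. It follows the variant that only uses sub-series lengths dividing the window length, and it estimates $H$ as the least squares slope in Equation (2.1). The function names, the minimum sub-series length `min_n = 8` and the simulated Gaussian noise standing in for standardized residuals are assumptions. The sketch expects windows whose length has several divisors (252 does) and sub-series that are not constant.

```python
import math
import random

def rescaled_range(block):
    """R/S of one sub-series: range of cumulative deviations from the mean,
    divided by the standard deviation of the sub-series."""
    n = len(block)
    mean = sum(block) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in block) / n)
    cum, lo, hi = 0.0, float("inf"), float("-inf")
    for x in block:
        cum += x - mean
        lo, hi = min(lo, cum), max(hi, cum)
    return (hi - lo) / std

def hurst_rs(x, min_n=8):
    """Slope of log(R/S(n)) regressed on log(n), using only lengths n
    that divide len(x), so every observation is used for each n."""
    L = len(x)
    log_n, log_rs = [], []
    for n in range(min_n, L // 2 + 1):
        if L % n:
            continue
        d = L // n
        rs = sum(rescaled_range(x[m * n:(m + 1) * n]) for m in range(d)) / d
        log_n.append(math.log(n))
        log_rs.append(math.log(rs))
    # Ordinary least squares slope.
    mx = sum(log_n) / len(log_n)
    my = sum(log_rs) / len(log_rs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(log_n, log_rs))
    sxx = sum((a - mx) ** 2 for a in log_n)
    return sxy / sxx

def rolling_hurst(x, w=252):
    """One estimate per window x[i], ..., x[i+w-1]."""
    return [hurst_rs(x[i:i + w]) for i in range(len(x) - w + 1)]

# Simulated stand-in for standardized residuals (not the paper's data).
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(300)]
H = rolling_hurst(noise, w=252)  # 300 - 252 + 1 = 49 estimates
```

Each exchange rate would yield one such vector `H` per estimation method, and these vectors are what the clustering step compares.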
For determining clusters, hierarchical clustering was employed with Ward's minimum variance method using the function hclust() from the stats R package. The distance was chosen as the squared Euclidean distance between vectors of time-varying Hurst exponents, which is required due to the usage of Ward's algorithm. We note that a popular similarity measure based on correlations is not applicable here. Two evolutions of Hurst exponents which differ by a constant have a perfect correlation. However, they might be on opposite sides of $H = 1/2$ and thus exhibit different characteristics, which we would like to take into account. To determine the number of clusters, we used the silhouette criterion (Rousseeuw, 1987) using the function silhouette() from the cluster R package (Maechler et al., 2019).

We employed four different calculations of the Hurst exponent, as described in Section 2, resulting in four different vectors of time-varying Hurst exponents for each exchange rate. Examples of the time-varying Hurst exponents are shown in Figure 2. As can be seen, the time-varying Hurst exponents for a particular exchange rate differ significantly depending on the estimation technique used, so it is meaningful to carry out cluster analysis for every one of them, resulting in 4 clusterings of the exchange rate market. Dendrograms and the resulting clusters are presented in Figures 3–6.

Figure 2: Time dependent estimates of the Hurst exponents for selected currencies (bottom figures), together with the original data (top) and standardized residuals (middle)

Figure 3: Hierarchical clustering of the currencies based on simplified Hurst exponent clustered by Ward's algorithm using squared Euclidean distance as the dissimilarity measure. Optimal clusters selected via silhouette criterion are visualised by the dashed frames.

Figure 4: Hierarchical clustering of the currencies based on corrected Hurst exponent clustered by Ward's algorithm using squared Euclidean distance as the dissimilarity measure. Optimal clusters selected via silhouette criterion are visualised by the dashed frames.

Figure 5: Hierarchical clustering of the currencies based on empirical Hurst exponent clustered by Ward's algorithm using squared Euclidean distance as the dissimilarity measure. Optimal clusters selected via silhouette criterion are visualised by the dashed frames.

Figure 6: Hierarchical clustering of the currencies based on corrected empirical Hurst exponent clustered by Ward's algorithm using squared Euclidean distance as the dissimilarity measure. Optimal clusters selected via silhouette criterion are visualised by the dashed frames.

6. Comparison of clusterings and final clusters

Clusterings presented in the previous section are not identical; however, certain pairs of exchange rates were in the same cluster in all four clusterings. We consider this to be a strong indicator that the Hurst exponent has a similar evolution for these two rates. As the number of such cases decreases, the similarity can be seen as weaker. Naturally, in many cases, the given pair of rates was never in the same cluster.

We associate the clustering results with a network whose nodes are the exchange rates. Two nodes are connected by an edge if they were in the same cluster at least once. The weight of the edge is given by the number of such clusterings. The resulting network is shown in Figure 7.

If we consider only the edges with weight at least equal to a certain threshold, the network splits into several connected components. The nodes in these components therefore represent sets of exchange rates for which the evolution of the Hurst exponent is similar. Therefore, we take the connected components as the final clusters of our analysis. The choice of the threshold is subjective; we base it on visualizing the networks corresponding to different thresholds. Depending on the data, we might need to find a trade-off between a large number of small components with strong connections between the nodes, and a small number of large components with weaker ties.
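The network step described above can be sketched in a few lines of plain Python. The sketch counts, for every pair of items, the number of clusterings in which they share a cluster, keeps the edges whose weight reaches the threshold, and returns the connected components. The function names and the toy labels for four items A to D are hypothetical and are not results from the paper.

```python
from itertools import combinations

def cooccurrence_weights(clusterings):
    """For each pair of items, count the clusterings in which both items
    have the same cluster label. Each clustering maps item -> label."""
    weights = {}
    for labels in clusterings:
        for a, b in combinations(sorted(labels), 2):
            if labels[a] == labels[b]:
                weights[(a, b)] = weights.get((a, b), 0) + 1
    return weights

def components(nodes, weights, threshold):
    """Connected components of the network that keeps only edges
    with weight >= threshold (simple union-find)."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for (a, b), w in weights.items():
        if w >= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for v in nodes:
        groups.setdefault(find(v), []).append(v)
    return sorted(sorted(g) for g in groups.values())

# Four hypothetical clusterings of four items.
clusterings = [
    {"A": 1, "B": 1, "C": 2, "D": 2},
    {"A": 1, "B": 1, "C": 1, "D": 2},
    {"A": 1, "B": 1, "C": 2, "D": 2},
    {"A": 2, "B": 2, "C": 1, "D": 1},
]
nodes = ["A", "B", "C", "D"]
weights = cooccurrence_weights(clusterings)
strict = components(nodes, weights, threshold=4)  # [['A', 'B'], ['C'], ['D']]
loose = components(nodes, weights, threshold=3)   # [['A', 'B'], ['C', 'D']]
```

In this toy example A and B share a cluster in all four clusterings, while C and D do so in three of them, so lowering the threshold from 4 to 3 merges C and D into one component. This mirrors the trade-off between many small, strongly tied components and fewer, larger ones.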
In our particular case we compare the components emerging from the thresholds 4 (the maximum possible weight of an edge) and 3 (which means that the exchange rates have to be in the same cluster at least 3 times out of 4 in order to be connected by an edge in the network). We do not consider lower values for the threshold; requiring the weight of the edge to be at least 3 means that the nodes connected by an edge must be in the same cluster in more than half of the cases. The clusters consisting of more than one node are presented in Tables 2 and 3.

Both clusterings seem reasonable. We can identify nodes in the clusters which can be expected to be in the same cluster based on the dependence of the economies and financial markets in the given countries. In the network from the threshold 4, we see a small cluster containing the United States dollar and the Hong Kong dollar. A larger cluster, containing six nodes, includes exchange rates of currencies of countries located in south, southeast and east Asia: Indian rupee, South Korean won, Malaysian ringgit, Philippine peso, Singapore dollar, Thai baht. We note, however, that being in the same cluster does not mean a similar evolution of the exchange rate itself. Instead, it means a similar evolution of the Hurst exponent. Therefore, a more detailed interpretation of the clusters would need a more careful look at the factors that might influence this feature of the exchange rates.

Table 2: Final clustering of the exchange rates for the threshold 4, nontrivial clusters (containing more than one exchange rate)

| Cluster | Exchange rates |
|---|---|
| 1 | USD, HKD |
| 2 | JPY, DKK, RON, CHF |
| 3 | CZK, NOK, HRK, RUB, AUD |
| 4 | GBP, BRL |
| 5 | HUF, CNY |
| 6 | PLN, SEK |
| 7 | INR, KRW, MYR, PHP, SGD, THB |

Figure 7: Network constructed from the clusterings in Figures 3–6 with the vertices corresponding to exchange rates and the weight of the edges corresponding to the number of times each pair of exchange rates ended up together in a cluster. The weight of the edge is visualized by the width of the line, the type of the line and by its colour (1 = thin, dashed, light grey; 2 = thin, solid, grey; 3 = thick, dashed, green; 4 = thick, solid, red).
Table 3: Final clustering of the exchange rates for the threshold 3, nontrivial clusters (containing more than one exchange rate)

| Cluster | Exchange rates |
|---|---|
| 1 | USD, HUF, CAD, CNY, HKD, INR, KRW, MXN, MYR, PHP, SGD, THB, ZAR |
| 2 | JPY, DKK, RON, CHF |
| 3 | CZK, NOK, HRK, RUB, AUD, NZD |
| 4 | GBP, BRL |
| 5 | PLN, SEK |

7. Conclusions

Many financial time series exhibit long-range dependence. We used this property for clustering the time series based on the Hurst exponent, which measures this dependence. We proposed a clustering procedure which uses several different estimates of the Hurst exponent and clusters their values obtained by a rolling window method. As a final step of our procedure, clusterings originating from the individual methods of Hurst exponent estimation were compared. In our example of exchange rates, it turns out that we are able to create the final clustering by requiring that members of each cluster are in the same cluster at least the specified number of times in the individual clusterings. We expect the same to hold also in the case of other data, since "similar time series" should often appear in the same cluster when considering different details of the clustering procedure. Therefore, our approach can be directly applied also to other time series.

Figure 8: Network obtained from Figure 7 by only considering edges with weights equal to 4. The vertices correspond to exchange rates and the weights of the edges correspond to the number of times each pair of exchange rates ended up together in a cluster.

The results which we have obtained provide a new application of the rolling window approach to Hurst exponent estimation, used earlier in Cajueiro and Tabak (2004), Jiang et al. (2018), and Sensoy and Hacihasanoglu (2014). Moreover, they make it possible to extend other clustering analyses such as Lahmiri (2016), by allowing more than one criterion to be used (e.g., several estimation methods in our particular case).

The extension of our results can go in two directions. The first one consists of a more detailed interpretation of the clustering.
As we noted, the estimates of the Hurst exponent provide information about the underlying time series, and we might study the external factors which lead to this behaviour of the data. This might also give a better understanding of the clusters and of why certain exchange rates (or other data considered) appear in the same or in different clusters, respectively.

Figure 9: Network obtained from Figure 7 by only considering edges with weights greater than 2. The vertices correspond to exchange rates and the weights of the edges correspond to the number of times each pair of exchange rates ended up together in a cluster. The weight of the edge is visualized by the type of the line and by its colour (3 = dashed, green; 4 = solid, red).

The other direction involves using different methods to construct the individual clusterings. The final comparison of clusters does not have any limitation on the number of clusterings which enter it, nor on the methods used to obtain them. There are many methods for estimating Hurst exponents, other distances between vectors of Hurst exponents may be considered, and we may use different clustering methods. Individual clusterings might get different weights and, instead of counting the number of occurrences in the same cluster, it is possible to weight them. It might be insightful to see how these different approaches influence the final clusters.

A possible limitation might be the need to find a suitable trade-off between clearly distinguished components in the network and the number of isolated nodes, corresponding to clusters containing one time series. If the condition for the existence of an edge between nodes is not sufficiently strict, i.e., only a small number of occurrences in the same cluster is required, the components are often large, which may not always be desirable. On the other hand, a high threshold often leaves a lot of nodes without an edge, leading to one-element clusters. However, we may be interested in finding similar time series to most of the data, instead of concluding that they form a separate cluster. A possible solution might be a modification of our final clustering step.
Instead of considering the components of the network, various methods for finding so-called communities in connected networks can be employed. They aim to divide the nodes into communities, which are characterized by many edges between the nodes within a community and a small number of edges between nodes in different communities. Reviews of such methods can be found in Fortunato (2010) and Javed et al. (2018). The proposed method and its possible modifications outlined above provide a new approach for clustering time series using networks and communities, as considered in Ferreira and Zhao (2015).

To conclude, we note again that the proposed approach can be used to analyze any time series with long-range dependence, or time series for which their regimes (persistent, anti-persistent or having quickly decaying correlations) need to be distinguished. Therefore we consider it to be an interesting addition to the topic of clustering time series with these properties.

Acknowledgment

We acknowledge the contribution of the Slovak Research and Development Agency under the project APVV-20-0311.

References

Aghabozorgi, S., Shirkhorshidi, A. S., & Wah, T. Y. (2015). Time-series clustering – a decade review. Information Systems, 53, 16–38. https://doi.org/10.1016/j.is.2015.04.007
Annis, A. A., & Lloyd, E. H. (1976). The expected value of the adjusted rescaled Hurst range of independent normal summands. Biometrika, 63(1), 111–116. https://doi.org/10.1093/biomet/63.1.111
Auer, B. R. (2016). Pure return persistence, Hurst exponents and hedge fund selection – a practical note. Journal of Asset Management, 17(5), 319–330. https://doi.org/10.1057/jam.2016.7
Beran, J. (2017). Statistics for long-memory processes. Routledge. https://doi.org/10.1201/9780203738481
Borchers, H. W. (2019). pracma: Practical numerical math functions (Version 2.2.9) [Computer software]. The Comprehensive R Archive Network. https://cran.r-project.org/package=pracma
Cajueiro, D. O., & Tabak, B. M. (2004). The Hurst exponent over time: Testing the assertion that emerging markets are becoming more efficient. Physica A: Statistical Mechanics and its Applications, 336(3-4), 521–537. https://doi.org/10.1016/j.physa.2003.12.031
Cajueiro, D. O., & Tabak, B. M. (2009). Testing for long-range dependence in the Brazilian term structure of interest rates. Chaos, Solitons & Fractals, 40(4), 1559–1573. https://doi.org/10.1016/j.chaos.2007.09.054
Di Matteo, T. (2007). Multi-scaling in finance. Quantitative Finance, 7(1), 21–36. https://doi.org/10.1080/14697680600969727
Feng, G.-F., Yang, H.-C., Gong, Q., & Chang, C.-P. (2021). What is the exchange rate volatility response to COVID-19 and government interventions? Economic Analysis and Policy, 69, 705–719. https://doi.org/10.1016/j.eap.2021.01.018
Ferreira, L. N., & Zhao, L. (2015). A time series clustering technique based on community detection in networks. Procedia Computer Science, 53, 183–190. https://doi.org/10.1016/j.procs.2015.07.293
Fortunato, S. (2010). Community detection in graphs. Physics Reports, 486(3-5), 75–174. https://doi.org/10.1016/j.physrep.2009.11.002
Fu, T.-c. (2011). A review on time series data mining. Engineering Applications of Artificial Intelligence, 24(1), 164–181. https://doi.org/10.1016/j.engappai.2010.09.007
Han, J., & Ge, Z. (2020). Effect of dimensionality reduction on stock selection with cluster analysis in different market situations. Expert Systems with Applications, 147, 113226. https://doi.org/10.1016/j.eswa.2020.113226
Hurst, H. E. (1951). Long-term storage capacity of reservoirs. Transactions of the American Society of Civil Engineers, 116(1), 770–799. https://doi.org/10.1061/taceat.0006518
Iorio, C., Frasso, G., D’Ambrosio, A., & Siciliano, R. (2018). A P-spline based clustering approach for portfolio selection. Expert Systems with Applications, 95, 88–103. https://doi.org/10.1016/j.eswa.2017.11.031
Javed, M. A., Younis, M. S., Latif, S., Qadir, J., & Baig, A. (2018). Community detection in networks: A multidisciplinary review. Journal of Network and Computer Applications, 108, 87–111. https://doi.org/10.1016/j.jnca.2018.02.011
Jiang, Y., Nie, H., & Ruan, W. (2018). Time-varying long-term memory in Bitcoin market. Finance Research Letters, 25, 280–284. https://doi.org/10.1016/j.frl.2017.12.009
Kido, Y. (2016). On the link between the US economic policy uncertainty and exchange rates. Economics Letters, 144, 49–52. https://doi.org/10.1016/j.econlet.2016.04.022
Kirchgässner, G., Wolters, J., & Hassler, U. (2013). Introduction to modern time series analysis. Springer. https://doi.org/10.1007/978-3-642-33436-8
Lahmiri, S. (2016). Clustering of Casablanca stock market based on Hurst exponent estimates. Physica A: Statistical Mechanics and its Applications, 456, 310–318. https://doi.org/10.1016/j.physa.2016.03.069
Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., & Hornik, K. (2019). cluster: Cluster analysis basics and extensions (Version 2.1.0) [Computer software]. The Comprehensive R Archive Network. https://cran.r-project.org/package=cluster
Maharaj, E. A., D’Urso, P., & Caiado, J. (2019). Time series clustering and classification. Chapman and Hall/CRC. https://doi.org/10.1201/9780429058264
Manasseh, C. O., Chukwu, N. O., Abada, F. C., Ogbuabor, J. E., Lawal, A. I., & Alio, F. C. (2019). Interactions between stock prices and exchange rates: An application of multivariate VAR-GARCH model. Cogent Economics & Finance, 7(1), 1681573. https://doi.org/10.1080/23322039.2019.1681573
Mandelbrot, B. B. (1975). Limit theorems on the self-normalized range for weakly and strongly dependent processes. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 31(4), 271–285. https://doi.org/10.1007/bf00532867
Mandelbrot, B. B., & Wallis, J. R. (1969). Robustness of the rescaled range R/S in the measurement of noncyclic long run statistical dependence. Water Resources Research, 5(5), 967–988. https://doi.org/10.1029/wr005i005p00967
Massahi, M., Mahootchi, M., & Arshadi Khamseh, A. (2020). Development of an efficient cluster-based portfolio optimization model under realistic market conditions. Empirical Economics, 59(5), 2423–2442. https://doi.org/10.1007/s00181-019-01802-5
O’Connell, P., Koutsoyiannis, D., Lins, H. F., Markonis, Y., Montanari, A., & Cohn, T. (2016). The scientific legacy of Harold Edwin Hurst (1880–1978). Hydrological Sciences Journal, 61(9), 1571–1590. https://doi.org/10.1080/02626667.2015.1125998
Peters, E. E. (1994). Fractal market analysis: Applying chaos theory to investment and economics. John Wiley & Sons.
Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65. https://doi.org/10.1016/0377-0427(87)90125-7
Sensoy, A., & Hacihasanoglu, E. (2014). Time-varying long range dependence in energy futures markets. Energy Economics, 46, 318–327. https://doi.org/10.1016/j.eneco.2014.09.023
Song, J. Y., Chang, W., & Song, J. W. (2019). Cluster analysis on the structure of the cryptocurrency market via Bitcoin–Ethereum filtering. Physica A: Statistical Mechanics and its Applications, 527, 121339. https://doi.org/10.1016/j.physa.2019.121339
Taqqu, M. S., Teverovsky, V., & Willinger, W. (1995). Estimators for long-range dependence: An empirical study. Fractals. Complex Geometry, Patterns, and Scaling in Nature and Society, 03(04), 785–798. https://doi.org/10.1142/s0218348x95000692
Weron, R. (2002). Estimating long-range dependence: Finite sample properties and confidence intervals. Physica A: Statistical Mechanics and its Applications, 312(1-2), 285–299. https://doi.org/10.1016/s0378-4371(02)00961-5
Wuertz, D., Setz, T., Chalabi, Y., Boudt, C., Chausse, P., & Miklovac, M. (2020). fGarch: Rmetrics - autoregressive conditional heteroskedastic modelling (Version 3042.83.2) [Computer software]. The Comprehensive R Archive Network. https://cran.r-project.org/package=fGarch
You, Y., & Liu, X. (2020). Forecasting short-run exchange rate volatility with monetary fundamentals: A GARCH-MIDAS approach. Journal of Banking & Finance, 116, 105849. https://doi.org/10.1016/j.jbankfin.2020.105849
Zhou, Z., Fu, Z., Jiang, Y., Zeng, X., & Lin, L. (2020). Can economic policy uncertainty predict exchange rate volatility? New evidence from the GARCH-MIDAS model. Finance Research Letters, 34, 101258. https://doi.org/10.1016/j.frl.2019.08.006