
Design and implementation of a distributed platform for data mining of big astronomical spectra archives

Author: Koza, Jakub
Publisher: Zenodo
DOI: 10.5281/zenodo.17537102
Source: https://zenodo.org/records/17537102/files/F8-BP-2015-Koza-Jakub-thesis-2.pdf
Czech Technical University in Prague
Faculty of Information Technology
Department of Software Engineering
Bachelor's thesis
Design and implementation of a distributed platform for data mining of big astronomical spectra archives
Jakub Koza
Supervisor: RNDr. Petr Škoda, CSc.
12th May 2015
Acknowledgements
I would like to thank my supervisor, RNDr. Petr Škoda, CSc., for his help and for giving me this opportunity, and Lumír Mrkva, the author of the original VO-CLOUD system, for support in the beginning of my implementation. We also acknowledge support of grant GAČR 13-08195S.

Declaration
I hereby declare that the presented thesis is my own work and that I have cited all sources of information in accordance with the Guideline for adhering to ethical principles when elaborating an academic final thesis.

I acknowledge that my thesis is subject to the rights and obligations stipulated by the Act No. 121/2000 Coll., the Copyright Act, as amended. In accordance with Article 46(6) of the Act, I hereby grant a nonexclusive authorization (license) to utilize this thesis, including any and all computer programs incorporated therein or attached thereto and all corresponding documentation (hereinafter collectively referred to as the "Work"), to any and all persons that wish to utilize the Work. Such persons are entitled to use the Work in any way (including for-profit purposes) that does not detract from its value. This authorization is not limited in terms of time, location and quantity. However, all persons that make use of the above license shall be obliged to grant a license at least in the same scope as defined above with respect to each and every work that is created (wholly or in part) based on the Work, by modifying the Work, by combining the Work with another work, by including the Work in a collection of works or by adapting the Work (including translation), and at the same time make available the source code of such work at least in a way and scope that are comparable to the way and scope in which the source code of the Work is made available.

In Prague on 12th May 2015 . . . . . . . . . . . . . . . . . . . . .
Czech Technical University in Prague
Faculty of Information Technology
© 2015 Jakub Koza. All rights reserved.
This thesis is school work as defined by the Copyright Act of the Czech Republic. It has been submitted at Czech Technical University in Prague, Faculty of Information Technology. The thesis is protected by the Copyright Act and its usage without the author's permission is prohibited (with exceptions defined by the Copyright Act).
Citation of this thesis
Koza, Jakub. Design and implementation of a distributed platform for data mining of big astronomical spectra archives. Bachelor's thesis. Czech Technical University in Prague, Faculty of Information Technology, 2015.
Abstrakt
The aim of this bachelor's thesis is to extend the existing distributed system VO-CLOUD, which provides users with storage space and computing power for creating computationally demanding astronomical experiments through a web interface. The resulting system is able to obtain input data directly from astronomical archives using the special astronomical protocols SSAP and DataLink. It is also able to delegate computations to a distributed computing machine and to visualize the results of the computations for the user directly in the web environment.
Keywords: Virtual Observatory, SSAP, DataLink, UWS, Java EE, astroinformatics
Abstract
The aim of this bachelor's thesis is to extend the current distributed system VO-CLOUD, which provides users with storage and computing capacity to conduct astronomical experiments in a web-based environment. The resulting system is capable of downloading input data directly from astronomical archives by using the special astronomical protocols SSAP and DataLink. The system is able to delegate computations to a distributed computational machine and it is able to visualize the results of the computations for the user directly in the web environment.

Introduction
Research of the night sky has changed drastically in the last few years thanks to the modernization of information technologies. Whereas in the past an astronomer had to wait even a couple of months to access a telescope, today he has almost immediate access to data thanks to a system called the Virtual Observatory (VO), in which the vast astronomical archives and databases around the world, together with analysis tools and computational services, are linked together into an integrated facility [1].

VO-CLOUD (originally called VO-KOREL, which was extended by various data mining capabilities) is a system implementing the basic principles and concepts of the Virtual Observatory, where astronomers can conduct their experiments with computationally intensive data mining algorithms and visualize them in a familiar and friendly graphical interface [2]. The significant disadvantage of this particular distributed system is that the data for an upcoming experiment, which may be quite big, have to be prepared in advance in the local storage of the experimenter and uploaded to the VO-CLOUD server each time an experiment is created.

The aim of this work is to analyse the workflow as well as the implementation of the contemporary VO-CLOUD server and its distributed workers (execution units of experiments), and to design and implement a new version of the VO-CLOUD server that will be capable of directly downloading data from remote resources using protocols like HTTP and FTP as well as IVOA-specific protocols such as SSAP and DataLink. Downloaded data will be available for submission within an experiment to the assigned workers for computation, and the results will then be automatically downloaded back to the VO-CLOUD server for possible visualization.
Chapter 1
Analysis of the current solution
This chapter briefly describes the currently implemented state of VO-CLOUD and the technologies involved.
1.1 Architecture
VO-CLOUD is a distributed system, which means that it consists of hardware or software components located at networked computers that communicate and coordinate their actions only by passing messages to achieve their task [3]. The VO-CLOUD system is composed of the following parts:
• One master server capable of communication with the experimenting user
• Several distributed nodes called workers that contain:
  – Binary files describing the long-running computational process
  – A simple application with the ability to communicate with the master server, to start the computational process and to dispatch the required data to it

The master server is the most important component of VO-CLOUD. Its main purpose is to provide a web interface for communication with an experimenting user and to delegate requested experiment computations to the chosen worker. The master server stores the information about all experiments in a database and periodically checks the state of experiments to see whether their execution has already finished. If it has, the results are downloaded from the worker back to the master server and deleted on the worker side.
Workers in the current solution are distinguished by the type of experiment computation they can execute. For example, one type of worker could execute the process performing the Random Decision Forests method used to data mine information from passed astronomical spectrum data [4]. Generally, every worker consists of binary files which are executed over the data as a new long-running process, and a lightweight application which manages the queue of executing jobs and starts the long-running computational process. Although the technology of the computational process is not restricted, usually a program written in Python is used on workers.

A VO-CLOUD deployment example is described by the deployment diagram in Figure 1.1. In this example there is only one worker machine (one Java Application Server) where two different applications are deployed. The first can dispatch computations to the Random Decision Forests method binaries, the second one can dispatch to the Self-organizing map method binaries. Communication between workers and the master server is maintained via the specialized Universal Worker Service protocol (UWS), which is described in a later section of this chapter.

Figure 1.1: Deployment diagram of the current solution
1.2 Technologies
Both the master server and the applications starting the computational process on workers are implemented on the Java EE platform.

"The aim of the Java EE platform is to provide developers with a powerful set of APIs while shortening development time, reducing application complexity, and improving application performance." [5]

The master server itself uses a significant number of technologies specified in the Java EE platform specification.
1.2.1 Java Persistence API
A lot of information on the master server has to be stored in the database, such as user accounts, the list of available workers, the history of experiment executions and much more. Java EE provides an API called the Java Persistence API which allows Java objects to be mapped automatically to a relational database such as MySQL or PostgreSQL.

"The Java Persistence API (JPA) is a Java standards–based solution for persistence. Persistence uses an object/relational mapping approach to bridge the gap between an object-oriented model and a relational database." [5]

Java objects that should be mapped into the relational database are called Entity classes in the context of the Java Persistence API. These Entity classes are mapped to tables in the database and their instance variables are mapped to columns of these tables. The whole Entity class and its instance variables can be annotated with special JPA annotations to achieve the required behaviour of the object/relational mapping, e.g., changing the names of the columns, adding database constraints and so on. It is also possible to put these settings into a configuration file instead of annotating Java objects, but the annotation approach seems to be more intuitive.
1.2.2 JavaServer Faces
JavaServer Faces is the main technology used on the master server to communicate with a user through the web interface. It is a server-side component framework for building Java technology–based web applications [5].

"One of the greatest advantages of JavaServer Faces technology is that it offers a clean separation between behaviour and presentation for web applications." [5]

Source code of the presentation tier in JavaServer Faces technology is divided into XHTML pages and Managed Beans. Each XHTML file represents the visual side of one page in the standardized XML format [6]. There are many tags that can be used inside these files. Whereas standard HTML tags are used directly as output to a user, the file is mostly composed of special JSF tags with special meaning. These tags add functionality beyond static HTML pages and they allow data changes, actions and events of the page to be bound to Java methods specified in Managed Beans using a special syntax called Expression Language.

A Managed Bean is a special type of Java class. By the JSF specification [5], Java classes used as Managed Beans must define a constructor without parameters so that they can be instantiated dynamically through the Java Reflection API [7]. Classes must also have a specified name that is used to identify the Managed Bean in the Expression Language in XHTML files. Classes are also required to have a defined scope. "Scope defines how application data persists and is shared." [5] The most commonly used scopes in JavaServer Faces applications are the Request and Session scopes. Request scope stores data only during a single HTTP request, whereas Session scope stores data across multiple HTTP requests and is always bound to a specific user [5]. Both the Expression Language name mapping and the scope can be specified either with Java annotations or in the JavaServer Faces configuration XML file. While JSF XHTML files describe mostly the visual side of the page rendered to the user, Managed Beans define the properties and functions of the UI components described by the XHTML pages.
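
The requirement for a constructor without parameters can be illustrated with plain reflection. The following is only a minimal sketch of the mechanism, not the JSF implementation; the class names GreetingBean and BeanFactorySketch are hypothetical and exist only for this example.

```java
import java.lang.reflect.Constructor;

// Hypothetical bean class, used only for this illustration.
class GreetingBean {
    private String name = "guest";

    // The constructor without parameters is what a framework needs
    // in order to create the bean reflectively.
    public GreetingBean() { }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

final class BeanFactorySketch {
    // Creates an instance of the given class through its no-argument constructor.
    static <T> T instantiate(Class<T> type) {
        try {
            Constructor<T> constructor = type.getDeclaredConstructor();
            return constructor.newInstance();
        } catch (ReflectiveOperationException e) {
            // Thrown, for example, when the class has no no-argument constructor.
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }
}
```

A class that declares only constructors with parameters would make getDeclaredConstructor() fail, which is why the specification asks for the no-argument form.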
1.2.3 Java Servlet Technology
Java Servlet Technology is defined in the Java EE platform specification and it is used on both the master server and the workers. The Java EE specification says:

"A servlet is a Java programming language class used to extend the capabilities of servers that host applications accessed by means of a request-response programming model. Although servlets can respond to any type of request, they are commonly used to extend the applications hosted by web servers. For such applications, Java Servlet technology defines HTTP-specific servlet classes." [5]

In the implementation of the VO-CLOUD system only HTTP servlets are used. To implement such a servlet it is necessary to extend the Java class HttpServlet from the javax.servlet.http package. Every HTTP request targeting the servlet is dispatched to one of the servlet's inherited methods depending on the HTTP method used in the client's request. For example, the HTTP POST method is dispatched to the doPost servlet method, HTTP GET to the doGet method and so on. These servlet methods can simply be overridden in the HttpServlet subclass to achieve the desired functionality. The list of all HTTP version 1.1 methods and their explanation is given in RFC 2616 [8]. Finally, the servlet must be registered under the required path within the context of the resulting web application. E.g., if the servlet was mapped to the path /files/image.jpg and the web application was deployed on the path http://localhost/vocloud, an HTTP request with method GET to the URL address http://localhost/vocloud/files/image.jpg would be dispatched to the servlet's method doGet. Registration can be done either with a Java annotation or in the XML configuration file.
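
The dispatch rule described above can be modelled in a few lines of plain Java. This is only a sketch of the naming and mapping convention and deliberately does not use the Servlet API; the class ServletDispatchSketch and its methods are hypothetical, and only exact path mappings are considered.

```java
import java.util.Locale;

final class ServletDispatchSketch {
    // Maps an HTTP method name to the HttpServlet handler name, e.g. GET -> doGet.
    static String handlerFor(String httpMethod) {
        String lower = httpMethod.toLowerCase(Locale.ROOT);
        return "do" + Character.toUpperCase(lower.charAt(0)) + lower.substring(1);
    }

    // True when the request URL equals the deployment URL followed by the servlet mapping.
    static boolean isMappedTo(String deploymentUrl, String servletMapping, String requestUrl) {
        return requestUrl.equals(deploymentUrl + servletMapping);
    }
}
```

With the example from the text, a GET request to http://localhost/vocloud/files/image.jpg matches the mapping /files/image.jpg of the application deployed at http://localhost/vocloud and is therefore handled by doGet.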
1.2.4 Enterprise Java Beans
Enterprise Java Beans (EJB) is a powerful technology and it is part of the Java EE platform specification. An enterprise bean is a server-side component that encapsulates the business logic of the application, i.e., it contains the code that fulfils the purpose of the application [5]. There are many benefits of using EJB in an application, such as automatic transaction management, concurrency management and security authorization. Moreover, EJB technology provides an API for asynchronous method invocation and the possibility to schedule server-side activities at desired times.

Enterprise Beans run in the EJB container, a runtime environment within a compliant application server. The master server uses the EJB technology and therefore it is necessary to deploy the master server application to an application server supporting EJB, such as GlassFish Server (https://glassfish.java.net/) or WildFly Server (http://wildfly.org/). The worker application can be deployed on these servers too. Nevertheless, because it does not use EJB technology but only Java Servlet Technology, it can also be deployed to application servers without an EJB container, such as Apache Tomcat (http://tomcat.apache.org/).

Enterprise Beans can be divided into the following two main types:
• Session Beans
• Message-driven Beans

A session bean's main task is to encapsulate business logic that can be invoked programmatically by calling its methods [5]. The tasks dispatched to a session bean by a client are then executed on the server side inside the EJB container, and so the client is shielded from the complexity of the business methods. There are many ways in which a client can invoke an EJB method of a session bean. For example, methods can be invoked remotely by using the Java Remote Method Invocation technology.

"The Java Remote Method Invocation (RMI) system allows an object running in one Java virtual machine to invoke methods on an object running in another Java virtual machine. RMI provides for remote communication between programs written in the Java programming language." [9]

This way of invocation could be used if the graphical user interface were implemented as a Java desktop application and not as a web application.
In the master server, methods of EJB session beans are invoked locally from Managed Beans of the JavaServer Faces framework. Managed JSF Beans use the dependency injection technique to acquire instances of EJB Session Beans. There are three types of Enterprise Session Beans.
• Stateful Session Beans are very similar to the Session scope defined in the JSF specification. Instance variables of a stateful session bean are always bound to the unique client that is using them. Nevertheless, in the case of the master server application, the Managed JSF Bean into which the stateful session bean is injected is considered to be the client. The lifetime of such a stateful bean is determined by the scope of the Managed Bean that the stateful bean is injected into. Stateful session beans are not very useful in web applications since JSF provides the possibility to keep information about users' sessions in Managed Beans annotated with Session scope.
• Stateless Session Beans are not bound to a specific client. The EJB container creates a pool of a few stateless bean objects and when a client needs to invoke a method, one stateless bean object is pulled out of the pool and offered to the client. When the method invocation ends, the stateless bean object is returned to the container's pool. For a subsequent method call the container can offer a different instance of the stateless bean, and so it is not guaranteed that instance variables will be kept. "Except during method invocation, all instances of a stateless bean are equivalent, allowing the EJB container to assign an instance to any client." [5] Without any configuration, every EJB method invocation is automatically wrapped in a transaction, and so a stateless session bean is often used for storing data to a database through the Java Persistence API.
• Singleton Session Beans were introduced in the EJB specification version 3.1. They are instantiated only once per application and exist for the whole lifecycle of the application [5]. They are used in situations where it is necessary to share the same information among multiple clients. Also, they are often used in conjunction with the timer service interface to make the EJB container invoke a singleton's method at a requested point in time. Figure 1.2 shows a simple example of a singleton session bean with a method doSomeWork that is invoked by the EJB container every thirty seconds. In practice, the master server uses the singleton timer service to periodically check experiments running on workers to see whether they have already completed.
    import javax.ejb.Schedule;
    import javax.ejb.Singleton;
    import javax.ejb.Startup;

    // Created once at application start-up and kept for the application's lifetime.
    @Startup
    @Singleton
    public class SchedulerBean {

        // Non-persistent timer: fires at every second divisible by 30.
        @Schedule(second = "*/30", minute = "*", hour = "*", persistent = false)
        public void doSomeWork() {
            // called every 30 seconds
        }
    }

Figure 1.2: Example of Singleton EJB using timer service

Message-driven Beans are a special kind of enterprise beans that allow a Java EE application to process messages asynchronously. They use the Java Message Service API (JMS), a Java API that allows applications to create, send, receive, and read messages using reliable, asynchronous, loosely coupled communication [5]. A message-driven bean simply acts as a JMS message listener where the source of messages can be any application capable of creating and sending a JMS message to a message queue created by an application server. Nevertheless, the master server does not use message-driven beans. Asynchronous method invocation in the master server is performed in session beans by using the special method annotation @Asynchronous introduced in EJB 3.1. The EJB container automatically calls such methods in a separate thread instead of using the main invocation thread.
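
The periodic check that the master server performs with the singleton timer service can be sketched outside an EJB container with the standard java.util.concurrent scheduler. This is an illustrative model under stated assumptions, not the VO-CLOUD code: the class JobPollerSketch and the finishedOnWorker predicate (standing in for a status request sent to a worker) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

final class JobPollerSketch {
    private final Set<String> tracked = ConcurrentHashMap.newKeySet();
    // Hypothetical stand-in for asking a worker whether a job has finished.
    private final Predicate<String> finishedOnWorker;

    JobPollerSketch(Predicate<String> finishedOnWorker) {
        this.finishedOnWorker = finishedOnWorker;
    }

    void track(String jobId) {
        tracked.add(jobId);
    }

    // One polling pass: returns the jobs found finished and stops tracking them.
    List<String> pollOnce() {
        List<String> finished = new ArrayList<>();
        for (String jobId : tracked) {
            if (finishedOnWorker.test(jobId)) {
                finished.add(jobId);
            }
        }
        tracked.removeAll(finished);
        return finished;
    }

    // Repeats pollOnce at a fixed period; the caller shuts the returned executor down.
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(() -> { pollOnce(); }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return timer;
    }
}
```

Keeping the polling pass in its own method makes it possible to exercise the logic without waiting for the timer; in a real master server the finished jobs would then have their results downloaded and removed from the worker.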
1.2.5 Universal Worker Service
The Universal Worker Service (UWS) is an IVOA recommendation that defines how to manage asynchronous execution of jobs on a service [10]. The International Virtual Observatory Alliance (IVOA) is an organisation that focuses on the development of the standards and recommendations needed to make the Virtual Observatory system possible.

Simple web services are synchronous and stateless. Synchronous means that the client waits for the end of the execution of the request. If the client disconnects from the service provider during execution, there is no reason to continue the execution and the activity is abandoned. A stateless service is a service that does not remember the results of a previous activity [10]. Synchronous stateless services work well when the following two criteria apply.
1. The service activity duration is short enough for the client to maintain the connection with the service.
2. Requirements analysis
SSA protocol is with the HTTP parameter REQUEST=queryData – this represents an operation that returns a table in VOTable format describing candidate datasets which can be retrieved, including standard metadata describing each dataset, and an access reference which can be used to retrieve the data [13]. For example, the request

http://vos2.asu.cas.cz/ccd700/q/ssa/ssap.xml?REQUEST=queryData&POS=2.67,56.89&SIZE=2

returns a structured document in VOTable format describing astronomical spectra found in the CCD700 VO archive with a "cone search" defined by a position (POS parameter) and a radius (SIZE parameter).
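
A request of this form can be assembled programmatically. The sketch below only concatenates the three parameters shown above; the class SsapQuerySketch and its method names are hypothetical, and no other SSAP parameters are handled.

```java
import java.math.BigDecimal;

final class SsapQuerySketch {
    // Formats a number without a trailing ".0", so 2.0 becomes "2".
    private static String num(double value) {
        return BigDecimal.valueOf(value).stripTrailingZeros().toPlainString();
    }

    // Builds a queryData request from a service URL, a position and a size.
    static String queryDataUrl(String serviceUrl, double ra, double dec, double size) {
        return serviceUrl
                + "?REQUEST=queryData"
                + "&POS=" + num(ra) + "," + num(dec)
                + "&SIZE=" + num(size);
    }
}
```

Calling queryDataUrl("http://vos2.asu.cas.cz/ccd700/q/ssa/ssap.xml", 2.67, 56.89, 2) reproduces the example URL given in the text.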
It is important to note that the SSAP queryData method does not return the spectral data itself but only discovers the spectra matched by a combination of SSAP parameters and returns metadata and information about how to obtain them. There are usually two methods to obtain a particular spectrum defined in the VOTable returned by an SSAP query.
• Access Reference – a mandatory column in the VOTable returned by an SSAP query, containing the URL address from which the required dataset can be directly downloaded.
• DataLink – a specialized protocol that is explained in the next section.
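
Reading the Access Reference column can be sketched with the XML parser bundled with the JDK. The example assumes that the column is marked by a FIELD element whose utype ends with "Access.Reference" and that the document contains a single table serialized as TABLEDATA; both are assumptions of this illustration, and the class AccessReferenceSketch is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class AccessReferenceSketch {
    // Returns the values of the access reference column, one per table row.
    static List<String> accessReferences(String votableXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(votableXml.getBytes(StandardCharsets.UTF_8)));

            // Find the position of the FIELD that describes the access reference.
            NodeList fields = doc.getElementsByTagName("FIELD");
            int column = -1;
            for (int i = 0; i < fields.getLength(); i++) {
                String utype = ((Element) fields.item(i)).getAttribute("utype");
                if (utype.toLowerCase(Locale.ROOT).endsWith("access.reference")) {
                    column = i;
                    break;
                }
            }

            List<String> urls = new ArrayList<>();
            if (column < 0) {
                return urls; // no access reference column found
            }
            // Each TR is one dataset; the TD at the same position holds its URL.
            NodeList rows = doc.getElementsByTagName("TR");
            for (int r = 0; r < rows.getLength(); r++) {
                NodeList cells = ((Element) rows.item(r)).getElementsByTagName("TD");
                if (column < cells.getLength()) {
                    urls.add(cells.item(column).getTextContent().trim());
                }
            }
            return urls;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a parseable VOTable", e);
        }
    }
}
```

The returned URLs can then be downloaded directly, which corresponds to the first of the two methods listed above.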
2.1.3 DataLink
DataLink is an IVOA recommendation of a protocol that is closely related to the SSA protocol.

"Its specificity is to provide a binding mechanism and metadata structure necessary to describe connected datasets or secondary data to independant datasets discovered in previous VO operations." [14]

DataLink provides a suitable alternative for obtaining datasets of spectra discovered by an SSAP query. In addition, it provides the possibility to define additional parameters that can adjust the data received from the service. Parameters can, for example, influence the resulting format of the dataset or they can invoke a preprocessing action such as spectrum normalization or a cut-out of selected spectral lines. The way the DataLink protocol is invoked is very similar to SSAP: it uses an HTTP GET-based interface to submit a parametrized request to the DataLink resource URL.
To obtain the desired dataset it is necessary to identify it first. The identifier used by the DataLink protocol is a column called Publisher DID in the VOTable returned by an SSAP query. Thereafter it is necessary to find out the DataLink resource URL, the parameters that DataLink supports and their supported values. This information can be found at the end of the VOTable returned by the SSAP query if the DataLink protocol is supported by the particular VO archive. However, nowadays there are only a few VO archives supporting the DataLink protocol. For the rest of them there is no other option than to use the direct download method with Access Reference to obtain datasets from VOTables acquired through an SSAP query.
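
A parametrized DataLink request can be sketched as a query string built from the dataset identifier and the chosen options. The parameter name ID for the Publisher DID, the option name FORMAT and the class DataLinkRequestSketch are assumptions made for this illustration; a real archive advertises its own parameter names and values in the VOTable.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class DataLinkRequestSketch {
    // Builds a GET URL: the identifier first, then the additional options in their given order.
    static String requestUrl(String resourceUrl, String publisherDid, Map<String, String> options) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("ID", publisherDid); // assumed name of the identifier parameter
        params.putAll(options);

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> param : params.entrySet()) {
            query.add(encode(param.getKey()) + "=" + encode(param.getValue()));
        }
        return resourceUrl + "?" + query;
    }

    // Percent-encodes a value so that characters such as ':' and '/' are safe in a query string.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

Encoding matters here because identifiers of the form ivo://... contain characters that are reserved in URLs.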
2.2 Functional requirements
The functional requirements of the VO-CLOUD system are divided into several sections.
2.2.1 General functional requirements
FR 1 The client must be able to communicate with the application through a web browser supporting HTML and JavaScript technologies.
FR 2 Communication between the application and the client is mediated by the HTTP protocol.
FR 3 HTTPS, the extension of the HTTP protocol for encrypted communication, is not supported in this version of VO-CLOUD.
FR 4 Every client using VO-CLOUD must be able to create his user account and log into it.
FR 5 The information about registered users must be stored in the database. For security reasons users' passwords must be hashed with the SHA-256 hashing algorithm.
FR 6 The application must offer functionality to reset a user's forgotten password and send the new one to his e-mail address.
FR 7 A logged-in user must be able to change his password.
FR 8 User accounts must be divided into three different groups:
• USER
• MANAGER
• ADMIN
FR 9 A user logged in as ADMIN must be able to administer all registered user accounts and to change the group of a user's account and other user properties.
FR 10 A user logged in as ADMIN must be able to administer available workers and computation types, set their attributes and disable them if necessary.
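
The hashing step required by FR 5 can be sketched with the JDK's MessageDigest class. This is a minimal illustration of plain SHA-256 as named in the requirement, not the VO-CLOUD implementation; the class PasswordHashSketch, the UTF-8 encoding of the password and the hexadecimal output format are assumptions of this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class PasswordHashSketch {
    // Returns the SHA-256 digest of the password as 64 lowercase hexadecimal characters.
    static String sha256Hex(String password) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(password.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to an unsigned value so that every byte prints as two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }
}
```

At login the submitted password would be hashed the same way and compared with the stored value, so the plain-text password never needs to be stored.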
2.2.2 VO-CLOUD storage functional requirements
The new version of the VO-CLOUD system must provide storage where the data for upcoming experiments can be prepared instead of being uploaded during the creation of a new job. Management of the storage is reserved for clients logged in with the user group MANAGER or ADMIN. Standard user accounts with the group USER have read-only access to this storage and they are not allowed to modify files saved in the storage in any way.
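
The access rule described above can be captured in a small enumeration. This is only a sketch of the rule as stated in the text; the type name UserGroup and the method names are hypothetical.

```java
enum UserGroup {
    USER, MANAGER, ADMIN;

    // Every logged-in group may list and download files.
    boolean canReadStorage() {
        return true;
    }

    // Only managers and administrators may create, rename, delete or upload.
    boolean canModifyStorage() {
        return this == MANAGER || this == ADMIN;
    }
}
```

Centralising the check in one place keeps the storage operations free of scattered group comparisons.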
The User in the context of the following functional requirements is a client logged in with any user group (USER, MANAGER, ADMIN).
FR 11 The User must be able to list all directories and files that are stored in the VO-CLOUD storage. The mandatory attributes that the User must be able to see are the names of files and directories, the size of files and the last modification time of files.
FR 12 The User must be able to navigate through the directory structure to see files and directories that are nested inside directories.
FR 13 The User must be able to download any chosen file to his local computer storage.
The Manager in the context of the following functional requirements is a client logged in with the user group MANAGER or ADMIN.
FR 14 The Manager must be able to create a new directory with a specified name in the chosen directory.
FR 15 The Manager must be able to rename a directory or a file.
FR 16 The Manager must be able to delete a file, or a directory recursively.
FR 17 The Manager must be able to directly upload files from his local computer storage to the target directory.
FR 18 The Manager must be able to initiate a download from a remote resource. The following are considered remote resources:
• an HTTP URL address of a directly downloadable file
• an HTTP URL address of a remote folder that is to be recursively downloaded through the HTTP protocol (For this feature to work correctly, the HTTP method GET called on the resource address must return a list of links to its subdirectories and files. This is a standard feature of the majority of HTTP servers called directory index listing.)
• an FTP URL address of a directly downloadable file
• an FTP URL address of a folder that is to be recursively downloaded
FR 19 The Manager must be able to initiate a download of spectra from a VO archive. The use case of this feature consists of the following steps:
1. The Manager directly uploads a VOTable file with the desired spectra metadata, or he specifies an SSAP URL address where the required VOTable can be queried.
2. The VOTable is parsed by the server application and information about the query status and spectra count is displayed to the Manager.
3. If the DataLink protocol is supported by the VO archive, information about its parameters and possible values is parsed from the VOTable and dynamically visualised to the Manager.
4. The Manager can decide whether the download will be executed through the direct Access Reference value or through the DataLink protocol (if supported). If DataLink is chosen, the Manager can set the dynamically visualised query parameters.
5. Finally the Manager submits the request and the download task is initiated.
FR 20 The Manager must be able to view the history of remote downloads and SSAP downloads. The history must contain the following properties:
• State – current status of the download task (possible states are CREATED, RUNNING, COMPLETED, FAILED)
• Creation time
• Finish time
• Target directory – the directory in the VO-CLOUD storage where downloaded files shall be saved
• Download URL – address of the downloadable resource
• Download log – the log, mostly containing information about download errors
FR 21 The HTTP, FTP and SSAP downloading service will not support resources requiring authorization in this version.
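
The recursive HTTP download in FR 18 depends on reading the links out of a directory index listing. The sketch below shows one possible way to do that with a regular expression; it is not the VO-CLOUD implementation, the class DirectoryIndexSketch is hypothetical, the base URL is assumed to end with a slash, and a production downloader would more likely use a real HTML parser.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DirectoryIndexSketch {
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Returns absolute URLs of the entries below baseUrl; directories keep their trailing slash.
    static List<String> childLinks(String baseUrl, String indexHtml) {
        URI base = URI.create(baseUrl);
        List<String> children = new ArrayList<>();
        Matcher matcher = HREF.matcher(indexHtml);
        while (matcher.find()) {
            String href = matcher.group(1);
            if (href.startsWith("?") || href.startsWith("#")) {
                continue; // column sorting links and page anchors
            }
            String resolved = base.resolve(href).toString();
            // Keep only entries strictly below the listed folder (drops "../" and external links).
            if (resolved.startsWith(baseUrl) && resolved.length() > baseUrl.length()) {
                children.add(resolved);
            }
        }
        return children;
    }
}
```

A recursive download would call childLinks again for every returned URL that ends with a slash and fetch the remaining URLs as files.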
2.2.3 Job management functional requirements
FR 22 Every logged-in user must be able to create a new experiment computation (in this context called a job). The system must dynamically generate the list of all job types that are available to the logged-in user, i.e., the user must not have the option to create a new job of a computation type that has no available workers.
FR 23 Job types must be divided into two main sections.
• Standard job types – jobs available to all logged-in users
• Restricted job types – jobs available only to managers, i.e., clients logged in with the user group MANAGER or ADMIN
Restricted job types must be invisible to non-managers. Administrator users must be able to set the job type restriction on the job administration page.
FR 24 After the user chooses the desired job type, a job creation window must be displayed. The window must allow the following information to be set.
• Project label – label of the newly created job
• Description – optional description of the job
• Email results – option to send results to the user's email address after job completion
• Configuration JSON file – configuration file in the JSON format that is to be used as input to the worker. The configuration can be uploaded and edited in a text editor on the page. The configuration JSON file also contains information about the files that must be downloaded from the VO-CLOUD master server to the worker and that are used as the source of a computation.
FR 25 The job creation page must provide two options for saving a job. The first one only saves the job and sends it to the dedicated worker. The second option does the same but in addition sends the PHASE=RUN parameter to the worker to set the job into the QUEUED phase.
FR 26 If a user is logged in as a manager (user group MANAGER or ADMIN), the user must optionally be able to specify a folder in the VO-CLOUD storage where the results of the newly created job shall be copied if the job successfully finishes in phase COMPLETED.
FR 27 The information about created jobs must be shown on a specialized page. Every user with the exception of administrators must be able to see only his own created jobs. Administrator users must in addition be able to see the jobs of all users. The job list page must show the following information about jobs.
• Job type
• Identifier – unique identifier of the job and its owning user
• Job label – project label property of the job
• Creation time
• Duration – duration of the job's execution
• Phase – execution phase of the job
The job list page must also provide an interface to invoke the following job operations.
• start a job in phase PENDING
• abort a running job
• completely delete a job with its possible results from VO-CLOUD
• show a new page with additional job details
FR 28 The page showing details about a job must contain the same information that is present on the job list page. Moreover, it must contain a button Run again which navigates to the create job page where the input fields are pre-filled with the information from the source job.
FR 29 The job details page must visually represent the directory structure of the downloaded job results. Any file in this structure must be downloadable. Any textual file must be viewable directly on the page.
FR 30 If the job results contain images (PNG, JPG or GIF) in the root folder or in the folder result or results, the images must be rendered directly on the job details page.
FR 31 If the job results contain HTML pages (HTM or HTML extension) in the root folder or in the folder result or results, the pages must be shown directly on the job details page as HTML pages nested in an iframe HTML element.
FR 32 If a user is logged in as a manager (user group MANAGER or ADMIN) and the job is in phase COMPLETED, the user must be able to copy the job's results from the job details page to a specified folder in the VO-CLOUD storage.
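
The phase changes mentioned in FR 25 and FR 27 can be modelled as a small state transition function. The sketch below is a simplification: it covers only a subset of the UWS phases and the two PHASE values RUN and ABORT, and the names JobPhase and UwsPhaseSketch are hypothetical.

```java
enum JobPhase { PENDING, QUEUED, EXECUTING, COMPLETED, ERROR, ABORTED }

final class UwsPhaseSketch {
    // Applies a value sent as the PHASE parameter to the current phase of a job.
    static JobPhase apply(JobPhase current, String phaseParameter) {
        switch (phaseParameter) {
            case "RUN":
                // Only a job that has not been started yet can be queued.
                return current == JobPhase.PENDING ? JobPhase.QUEUED : current;
            case "ABORT":
                boolean active = current == JobPhase.PENDING
                        || current == JobPhase.QUEUED
                        || current == JobPhase.EXECUTING;
                // Finished jobs keep their final phase.
                return active ? JobPhase.ABORTED : current;
            default:
                throw new IllegalArgumentException("Unsupported PHASE value: " + phaseParameter);
        }
    }
}
```

In this model the second save option of FR 25 corresponds to creating a job in phase PENDING and immediately applying RUN to it.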
2.3 Non-functional requirements
NFR 1 The worker application must be redesigned to allow quick creation of the deployment package and quick deployment.
NFR 2 The system must be able to recover when one of its workers is disconnected or when the connection between the server and a remote server is lost while files are being downloaded.
NFR 3 The master server's client for the SSAP and DataLink protocols must be able to communicate with different types of VO archive servers.
NFR 4 The master server and its workers must be able to run on one single application server as well as distributed over several application servers on different machines.
NFR 5 The source code of the VO-CLOUD system must be published under an Open Source license and be publicly available in a public repository.
NFR 6 Applications must be compatible with all application server types supporting the necessary technologies and deployable on different platforms.
NFR 7 Users must be shielded from each other. One user must not be able to get the input data and results of a job of another user in any way (with the exception of an administrator).
Chapter 3
Worker realisation

A few changes had to be made in the worker design and implementation in
order to preserve the same functionality of the VO-CLOUD system and to make
worker deployment easier. These changes are explained in detail in the
following sections of this chapter.
3.1 Universal worker concept

As explained before, the worker computing node consists of executable
binary files and a servlet-based application deployed on a Java application
server (see Section 1.1). The problem is that for every new computational
method, i.e., every new executable computational application, it is
necessary to create a new servlet-based application in which the steps of
invoking the executable are defined in the source code. Moreover, every
computing node where the worker is deployed can have different parameters;
for example, the path to the executable computational application can
differ. These parameters are specified in a resource configuration file
which has to be built together with the compiled source code. Imagine an
example with two different computational applications on three computing
nodes. In this case it would be necessary to create two different
implementations of the servlet-based application and to build six packages
in total, each of which has to be deployed on the correct application
server.

A universal worker is a new type of servlet-based application that is
used instead of all other worker application types. The idea is to deploy
only one instance of the universal worker application on a worker node,
where it supports multiple executable computational applications. This
method has many benefits:
• The deployment on the application servers is easier.
• An application server uses fewer resources.
• Constraints on the execution queue, such as the maximum number of
  concurrently running jobs, are now applied to the whole computing node
  and not only to a single computational application. This is a more
  logical solution because a computing node is usually limited as a whole.
• The capabilities of the UWS protocol are fully utilized. The original
  solution used only one job list resource per application.
• Addressing of the UWS resources is more intuitive. For example, compare
  two addresses of the original solution
      http://localhost/rdf/uws/
      http://localhost/som/uws/
  with the new solution
      http://localhost/universal/uws/rdf/
      http://localhost/universal/uws/som/
  The master server application can now hold information about the
  available computing nodes and, for every node, a list of the
  computational methods that the node is capable of running. A URL address
  of the UWS resource can be constructed from the URL address of the
  computing node and the computational method name.
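The address construction from the last benefit can be sketched as follows. This is a minimal Python illustration under the assumption that the node address already points at the UWS root (as in the example addresses above); the function name is hypothetical.

```python
def uws_resource_url(node_url: str, method_name: str) -> str:
    """Join a computing node's UWS root URL with a computational method name."""
    return f"{node_url.rstrip('/')}/{method_name.strip('/')}/"


print(uws_resource_url("http://localhost/universal/uws", "rdf"))
print(uws_resource_url("http://localhost/universal/uws/", "som"))
```

Normalising the slashes means the administrator can enter the node address with or without a trailing slash.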
The universal worker is configured through an XML document matching an XSD
schema created specially for this purpose. Figure 3.1 shows an example of
such an XML configuration file with the computational method RDF. There
can of course be many more <ns:worker> tags inside the <ns:workers> tag in
order to describe more computational methods. The tags <ns:exec-command>
and <ns:command> describe how to start the computational process.

In the new version of the VO-CLOUD server, the configuration file is packed
in the deployment archive as a resource file. For each computing node it is
therefore necessary to build its own deployment archive with the right
configuration file. However, the universal worker is designed to be able to
download the configuration file from a dedicated remote repository, for
example from the master server. In this case it would only be necessary to
deploy the same universal worker application on each computing node's
application server and to put all configuration files into one place on the
master server. Every universal worker would download its configuration file
from the master server through a specialized interface during its
initialization, and the administrator would not need to configure workers
on the master server because it would already know the information. This
feature is going to be implemented in the next version of the VO-CLOUD
system.
<?xml version="1.0" encoding="utf-8"?>
<ns:uws-settings
    xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
    xmlns:ns='http://vocloud. k.cz/schema'
    xsi:schemaLocation='http://vocloud. k.cz/schema configSchema.xsd'>
  <ns:vocloud-server-address>http://localhost/vocloud2</ns:vocloud-server-address>
  <ns:local-address>http://localhost/universal/uws</ns:local-address>
  <ns:max-jobs>2</ns:max-jobs>
  <ns:description>Universal UWS worker</ns:description>
  <ns:workers>
    <ns:worker>
      <ns:identifier>rdf</ns:identifier>
      <ns:description>RDF</ns:description>
      <ns:restricted>false</ns:restricted>
      <ns:binaries-location>/home/voadmin/RDF</ns:binaries-location>
      <ns:exec-command>
        <ns:command>python3</ns:command>
        <ns:command>
          ${binaries-location}/runRF.py
        </ns:command>
        <ns:command>${config-file}</ns:command>
      </ns:exec-command>
    </ns:worker>
  </ns:workers>
</ns:uws-settings>
Figure 3.1: Universal worker configuration file
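To illustrate how such a configuration could be turned into a command line, the following Python sketch reads a trimmed-down configuration and expands the placeholders. It is not the worker's actual (Java) code: the namespace URI in the sample is a placeholder, the helper names are hypothetical, and the meaning of ${config-file} (the path of the job's configuration file) is an assumption. Elements are matched by local name so that the sketch does not depend on the namespace.

```python
import xml.etree.ElementTree as ET

# Trimmed-down configuration; the namespace URI is a placeholder.
SAMPLE = """
<ns:uws-settings xmlns:ns="http://example.org/schema">
  <ns:max-jobs>2</ns:max-jobs>
  <ns:workers>
    <ns:worker>
      <ns:identifier>rdf</ns:identifier>
      <ns:binaries-location>/home/voadmin/RDF</ns:binaries-location>
      <ns:exec-command>
        <ns:command>python3</ns:command>
        <ns:command>
          ${binaries-location}/runRF.py
        </ns:command>
        <ns:command>${config-file}</ns:command>
      </ns:exec-command>
    </ns:worker>
  </ns:workers>
</ns:uws-settings>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def child_text(element, name):
    """Text of the first direct child with the given local name, or ''."""
    for child in element:
        if local_name(child.tag) == name:
            return (child.text or "").strip()
    return ""


def exec_commands(xml_text, config_file):
    """Map each worker identifier to its expanded argument list."""
    result = {}
    for worker in ET.fromstring(xml_text).iter():
        if local_name(worker.tag) != "worker":
            continue
        variables = {
            "${binaries-location}": child_text(worker, "binaries-location"),
            "${config-file}": config_file,  # assumed: path of the job's config
        }
        argv = []
        for command in worker.iter():
            if local_name(command.tag) != "command":
                continue
            argument = (command.text or "").strip()
            for placeholder, value in variables.items():
                argument = argument.replace(placeholder, value)
            argv.append(argument)
        result[child_text(worker, "identifier")] = argv
    return result


print(exec_commands(SAMPLE, "/tmp/job-1/config.json"))
```

Keeping each <ns:command> as a separate argument avoids shell quoting problems when the process is started.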
4. Master server realisation
The FilesystemManipulator bean can have security constraints defined on its
methods. For instance, the method deleteFileRecursively, which is capable
of recursively deleting a whole directory, can be annotated with a security
constraint so that it is available only to managers and administrators and
not to common users. The authorization check is then done not only in the
presentation tier of the application but also in the business tier, which
makes the application potentially more secure.
4.2 Remote download feature

Administrators and managers can create a new task that downloads the
desired files from a remote resource (HTTP or FTP server). In the context
of the VO-CLOUD application this task is called a download job. Two
specialized EJB Stateless beans were implemented for this feature:
DownloadManager and DownloadProcessor. DownloadManager provides the method
enqueueNewURLDownload for creating a new download job. This method stores
information about the download job in the database and asynchronously
invokes a method of the DownloadProcessor bean, where the download itself
is initiated. The DownloadManager bean therefore only prepares the download
job for asynchronous execution in the DownloadProcessor bean.
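The split between the two beans can be illustrated with a small Python analogue. This is a sketch of the pattern only, not the EJB code: a thread pool stands in for the container's asynchronous invocation, a list stands in for the database, and the job fields and state names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


class DownloadProcessor:
    """Performs the download; called on a background thread."""

    def process(self, job):
        job["state"] = "RUNNING"
        # A real processor would fetch job["url"] into job["target"] here.
        job["state"] = "FINISHED"
        return job


class DownloadManager:
    """Records a download job and hands it over without waiting for it."""

    def __init__(self, processor, executor):
        self.processor = processor
        self.executor = executor
        self.jobs = []  # stands in for the database table of download jobs

    def enqueue_new_url_download(self, url, target):
        job = {"id": len(self.jobs) + 1, "url": url,
               "target": target, "state": "CREATED"}
        self.jobs.append(job)  # "store to the database"
        return self.executor.submit(self.processor.process, job)


with ThreadPoolExecutor(max_workers=2) as executor:
    manager = DownloadManager(DownloadProcessor(), executor)
    future = manager.enqueue_new_url_download(
        "ftp://example.org/spectra/", "downloads/spectra")
    finished = future.result()  # wait only so the example can print a result

print(finished["state"])  # FINISHED
```

The caller of enqueue_new_url_download gets control back immediately, which is the point of separating the manager from the processor.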
The great advantage of the remote download feature is that it supports
recursive downloads of whole directories. In the FTP protocol this is
relatively easy, because FTP provides a command to list the directories and
files in the current working directory. Downloading directories over HTTP
is more problematic. The targeted HTTP server, where the URL address
represents the directory to be downloaded, must have the feature called
index directory listing enabled, i.e., the HTTP server must return a list
of URL links pointing to the directory's files and subdirectories. An
example of such a page can be seen in Figure 4.1. This directory listing
page can be parsed by the DownloadProcessor bean and used for crawling
through the directory tree in order to download the whole targeted
directory recursively.
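The crawling idea can be sketched in Python as follows. This is not the DownloadProcessor implementation: the rules for skipping links (sort links containing "?", links leading outside the subtree) are assumptions about typical listing pages, and the page fetcher is passed in as a function so that the example runs without a network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href attribute of every <a> tag on a listing page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def list_files(base_url, fetch):
    """Recursively collect file URLs below base_url (must end with '/')."""
    collector = LinkCollector()
    collector.feed(fetch(base_url))
    files = []
    for href in collector.links:
        url = urljoin(base_url, href)
        # Skip sort links, the parent directory and anything outside the subtree.
        if "?" in href or url == base_url or not url.startswith(base_url):
            continue
        if url.endswith("/"):
            files.extend(list_files(url, fetch))  # subdirectory: descend
        else:
            files.append(url)
    return files


# Two fake listing pages standing in for a real HTTP server.
PAGES = {
    "http://example.org/spectra/":
        '<a href="?C=N;O=D">Name</a><a href="/">Parent Directory</a>'
        '<a href="a.fits">a.fits</a><a href="sub/">sub/</a>',
    "http://example.org/spectra/sub/":
        '<a href="/spectra/">Parent Directory</a><a href="b.fits">b.fits</a>',
}

print(list_files("http://example.org/spectra/", PAGES.__getitem__))
```

With a real server, fetch would be replaced by a function that performs an HTTP GET and returns the page text.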
Figure 4.1: Page with a directory listing

4.3 SSAP and DataLink client

The master server allows managers and administrators to download data
directly from the VO archives by using the SSAP and DataLink protocols.
First, it is necessary to get the VOTable representing the list of spectra
to be downloaded. This is simple: the user either uploads the VOTable
directly to the master server or specifies the URL address of the SSAP
resource from which the VOTable can be downloaded. The VOTable must then be
parsed. Originally I wanted to use the Java Architecture for XML Binding
technology (JAXB) to convert the VOTable structure directly into a set of
mapped Java objects. However, there are different versions of the VOTable
XML document and it would be complicated to create a new set of Java
objects for each version of the VOTable schema. In the end, the Simple API
for XML (SAX) approach to XML parsing was used for VOTable processing. The
SAX parser simply goes through the VOTable file step by step and calls a
method when one of the following events happens:
• an XML opening element was found
• an XML closing element was found
• characters between XML elements were found

SAX parsing is potentially many times faster than JAXB because it does not
have to convert all elements to their Java object counterparts; instead it
runs through the XML file once and remembers only what is necessary for
further processing.
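The three SAX events can be illustrated with Python's xml.sax module. This is a sketch and not the VotableParser class: it assumes that the access reference column is a FIELD whose utype ends with Access.Reference, that DataLink support is announced by a RESOURCE with utype adhoc:service, and that element names carry no namespace prefix. The sample VOTable is invented for the example.

```python
import xml.sax

SAMPLE = b"""<VOTABLE version="1.2">
  <RESOURCE type="results">
    <TABLE>
      <FIELD name="title" datatype="char" arraysize="*"/>
      <FIELD name="acref" utype="ssa:Access.Reference" datatype="char" arraysize="*"/>
      <DATA><TABLEDATA>
        <TR><TD>spec 1</TD><TD>http://example.org/s1.fits</TD></TR>
        <TR><TD>spec 2</TD><TD>http://example.org/s2.fits</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
  <RESOURCE type="meta" utype="adhoc:service">
    <PARAM name="accessURL" datatype="char" arraysize="*" value="http://example.org/datalink"/>
  </RESOURCE>
</VOTABLE>"""


class AccessReferenceHandler(xml.sax.ContentHandler):
    """Remembers only the access reference URLs and a DataLink flag."""

    def __init__(self):
        super().__init__()
        self.field_count = 0
        self.acref_index = None
        self.has_datalink = False
        self.urls = []
        self._column = -1
        self._in_td = False
        self._text = []

    def startElement(self, name, attrs):  # event: opening element found
        if name == "FIELD":
            if attrs.get("utype", "").lower().endswith("access.reference"):
                self.acref_index = self.field_count
            self.field_count += 1
        elif name == "RESOURCE" and attrs.get("utype") == "adhoc:service":
            self.has_datalink = True
        elif name == "TR":
            self._column = -1
        elif name == "TD":
            self._column += 1
            self._in_td = True
            self._text = []

    def characters(self, content):  # event: characters between elements
        if self._in_td:
            self._text.append(content)

    def endElement(self, name):  # event: closing element found
        if name == "TD":
            self._in_td = False
            if self._column == self.acref_index:
                self.urls.append("".join(self._text).strip())


handler = AccessReferenceHandler()
xml.sax.parseString(SAMPLE, handler)
print(handler.urls)
print(handler.has_datalink)
```

Only the URLs and one flag are kept in memory, regardless of how many rows the table has.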
The important thing is that the parser must be able to recognize from a
VOTable whether the DataLink protocol is available. If so, the parser must
collect information about its possible parameters. The VOTable parser is
available through the interface of the class VotableParser. Its method
returns an instance of the class IndexedSSAPVotable, which holds the
information necessary for downloading spectra through the Access Reference
column or, if supported, through the DataLink protocol.

The user can then choose whether to download using the Access Reference
column or using the DataLink protocol. If he chooses the DataLink protocol,
he can specify the parameters that the DataLink protocol supports. The web
page where the user specifies the DataLink parameters is created
dynamically according to the DataLink protocol information provided in the
parsed VOTable.

After the download task is submitted, a new download job is created in the
DownloadManager Stateless bean for each spectrum specified in the VOTable.
The whole list of spectra is then asynchronously dispatched to the
DownloadProcessor bean, where the download jobs are processed.
4.4 Preprocessing

For some data types, and especially for data intended for data mining, it
is usually necessary to preprocess them (for instance by normalization,
rebinning and so on). Preprocessing is defined as an operation that takes
selected spectra as an input and produces a file or files that must be
saved to the VO-CLOUD storage to be available to common users.

The idea is to consider preprocessing as a computational method that can
be executed on workers. Preprocessing is then defined as a method
restricted to managers and administrators. It is also necessary to
implement a feature that moves the results of a job's computation into a
specified target folder in the VO-CLOUD storage. This feature is of course
available only to managers and administrators because a common user does
not have permission to save files into the VO-CLOUD storage.

In order to preprocess downloaded data, the storage manager must create a
new preprocessing job and set the targeted folder in the VO-CLOUD storage
on the job creation page. After the job completes, the master server
automatically copies the result files to the targeted VO-CLOUD storage
directory and the data can then be used by any common user for a new
experiment computation.
4.5 Workers management

The master server must be informed which types of computation are available
and which workers support them in order to dispatch job computations to
them effectively. In this version of VO-CLOUD the information about the
possible computational types and workers must be set manually by the
administrator through a specialized web page. However, in future versions
of VO-CLOUD it is expected that the information will be passed
automatically when a newly deployed worker registers itself.

Figure 4.2: Class diagram of JPA Entity classes. The diagram (class JPA
Entity Model) shows three Serializable classes:
Worker: id : Integer, resourceUrl : String, shortDescription : String,
description : String, maxJobs : Integer
UWSType: id : Integer, stringIdentifier : String, shortDescription : String,
description : String, documentationUrl : String, restricted : Boolean
UWS: id : Integer, enabled : Boolean
Each UWS instance belongs to one Worker and one UWSType (1 to 0..*).
The JPA Entity classes used for object/relational mapping to a database had
to be redesigned in order to allow future extensibility, to allow a better
load balancing method for computation dispatching and to make possible the
dynamic rendering of the available job computational types in the web user
interface. Moreover, the redesigned model is more logical in the context of
the newly created Universal worker concept (see Section 3.1). The class
diagram of the redesigned JPA Entity classes can be seen in Figure 4.2.

Class Worker represents a computational worker. Every computational worker
must have a resource URL address where the UWS service is available, a
short description by which the worker can be recognized, the maximum number
of jobs that can run in parallel on the worker and, optionally, a long
description of the worker.
Class UWSType represents a computational method. The computational method
must have a string identifier which is used to name a job list queue in
workers, a short description that is used as the name of the computational
method in the web user interface, a restricted flag that signals whether
the computational method is restricted to managers and administrators and,
finally, two optional parameters: a long description and a URL address
where the computational method is documented.

Class UWS is a mediator of the M:N relationship between a computational
worker and a computational method. Moreover, this class contains the
parameter enabled, which allows an administrator to disable the usage of a
specified computational method on a specified worker.

All three classes also contain the parameter id, which is mandatory
according to the JPA specification and is used as the primary key in the
relational database.
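The relationships between the three classes can be sketched with plain Python data classes. This mirrors the attributes described above but is not the JPA code; the snake_case names, the default values and the helper function are assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Worker:
    id: int
    resource_url: str          # URL where the worker's UWS service runs
    short_description: str
    max_jobs: int              # maximum number of jobs run in parallel
    description: Optional[str] = None


@dataclass
class UWSType:
    id: int
    string_identifier: str     # names the job list on the workers, e.g. "rdf"
    short_description: str     # label shown in the web user interface
    restricted: bool = False   # True: managers and administrators only
    description: Optional[str] = None
    documentation_url: Optional[str] = None


@dataclass
class UWS:
    """Link between one worker and one computational method (M:N mediator)."""
    id: int
    worker: Worker
    uws_type: UWSType
    enabled: bool = True


def workers_for_method(links: List[UWS], identifier: str) -> List[Worker]:
    """Workers on which the given computational method is enabled."""
    return [
        link.worker
        for link in links
        if link.enabled and link.uws_type.string_identifier == identifier
    ]


node_a = Worker(1, "http://node-a.example/universal/uws", "Node A", 2)
node_b = Worker(2, "http://node-b.example/universal/uws", "Node B", 4)
rdf = UWSType(1, "rdf", "Random decision forest")
links = [UWS(1, node_a, rdf), UWS(2, node_b, rdf, enabled=False)]
print([w.short_description for w in workers_for_method(links, "rdf")])
```

Disabling a method on one node only flips the enabled flag of the link, leaving both the worker and the method records untouched.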
4.6 Jobs load balancing

In order to maximize the throughput of the whole distributed VO-CLOUD
system it was necessary to design an algorithm that assigns a newly created
computational job to the least loaded worker. The algorithm follows these
steps:
1. Find all computational workers that are able to execute the desired
   computational method.
2. Choose the first computational worker.
3. Find out how many jobs assigned to this worker are in the execution
   phase EXECUTING.
4. Find out how many jobs can run in parallel on this worker.
5. Compute the difference: the maximum number of jobs running in parallel
   minus the number of jobs in the phase EXECUTING.
6. Remember the difference and continue from step 3 with the next
   computational worker, if there is one.
7. Find the worker with the greatest difference. If several workers share
   the same greatest difference, choose one of them randomly.
8. Assign the new job to the chosen computational worker.
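The steps above can be sketched in a few lines of Python. This is an illustration of the described algorithm, not the master server's code; the input shapes (a list of name and capacity pairs, and a dictionary of EXECUTING counts) are assumptions, and the sketch still returns a worker when no free slot exists, a case the steps do not address.

```python
import random


def choose_worker(workers, executing_counts, rng=random):
    """Pick the worker with the most free slots.

    workers: list of (name, max_jobs) pairs able to run the method (step 1)
    executing_counts: dict mapping name to jobs in phase EXECUTING (step 3)
    """
    if not workers:
        return None
    # Steps 2 to 6: free slots = max parallel jobs minus EXECUTING jobs.
    free = {name: max_jobs - executing_counts.get(name, 0)
            for name, max_jobs in workers}
    # Step 7: greatest difference, ties broken randomly.
    best = max(free.values())
    candidates = [name for name, slots in free.items() if slots == best]
    return rng.choice(candidates)


workers = [("node-a", 2), ("node-b", 4), ("node-c", 4)]
executing = {"node-a": 0, "node-b": 3, "node-c": 1}
print(choose_worker(workers, executing))  # node-c
```

In this example node-a has 2 free slots, node-b has 1 and node-c has 3, so the new job goes to node-c (step 8).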
Chapter 5
Future development

There are many ways to improve the VO-CLOUD distributed system. The most
important thing is to make the deployment of the master server and of the
distributed workers easier. The following deployment improvements are
planned for future versions:
• Move the worker XML configuration files to one place from which they
  can be downloaded by newly deployed workers.
• Allow workers to register themselves with the master server through a
  specialized web service. It would then not be necessary for
  administrators to configure every worker manually on the master server.
• Simplify the installation of application servers and executable
  computational files by using a virtualization tool such as Docker
  (https://www.docker.com/).

The VO-CLOUD storage is planned to be improved too:
• Allow a manager to select multiple files or directories and to delete
  them together.
• Allow the operations move and copy.
• Introduce the Simple Application Messaging Protocol (SAMP) [15], which
  would allow users to send spectra from the VO-CLOUD storage directly
  to a visualisation tool running on their computer, such as SPLAT-VO.
• Give the manager the ability to stop a currently running download job
  and to delete items from the download job history.
Conclusion

The goal of this thesis has been met. The fundamental concepts and the
workflow of the original VO-CLOUD system have been analysed, and the master
server and the universal worker parts of the distributed VO-CLOUD system
have been successfully implemented. The VO-CLOUD system is now able to
download astronomical spectra from VO archives by using the astronomical
protocols SSAP and DataLink, to run preprocessing on them, to feed them to
the computational workers and to display the results of the computations
visually to the users.

The concept of many different types of computational workers has been
redesigned into one universal worker, and the process of deployment on the
computation nodes is therefore simplified.

I have gained valuable experience during the design and implementation of
the new version of the VO-CLOUD system and I have acquired knowledge about
the fundamental concepts of VO technologies and about astronomy as a whole.
Bibliography

[1] Hanisch, R.; Quinn, P. International Virtual Observatory Alliance
[online]. The IVOA, [cit. 2015-05-04]. Available from:
http://www.ivoa.net/about/TheIVOA.pdf
[2] Mrkva, L. VO-KOREL, server for astronomical cloud computing.
Bachelor's thesis, Czech Technical University in Prague, Faculty of
Information Technology, Prague, 2012.
[3] Coulouris, G.; Dollimore, J.; Kindberg, T.; et al. Distributed Systems:
Concepts and Design (5th Edition). Pearson, 2011, ISBN 0132143011.
[4] Palička, A. Application of Random Decision Forests in
Astroinformatics. Bachelor's thesis, Czech Technical University in Prague,
Faculty of Information Technology, Prague, 2014.
[5] Oracle. Java Platform, Enterprise Edition; The Java EE Tutorial;
Release 7 [online]. September 2014, [cit. 2014-05-05]. Available from:
https://docs.oracle.com/javaee/7/JEETT.pdf
[6] WWW Consortium. Extensible Markup Language (XML) 1.0 (Fifth
Edition) [online]. November 2008, [cit. 2015-05-05]. Available from:
http://www.w3.org/TR/REC-xml/REC-xml-20081126-review.html
[7] Oracle. Trail: The Reflection API [online]. [cit. 2015-05-06].
Available from: http://docs.oracle.com/javase/tutorial/reflect/index.html
[8] The Internet Society. Hypertext Transfer Protocol -- HTTP/1.1 [online].
1999, [cit. 2015-05-07]. Available from:
http://tools.ietf.org/pdf/rfc2616.pdf
[9] Oracle. Trail: RMI [online]. [cit. 2015-05-08]. Available from:
https://docs.oracle.com/javase/tutorial/rmi/
C. Universal worker XML configuration file schema
anyURI"/>
        <xsd:element name="local-address" type="xsd:anyURI"/>
        <xsd:element name="max-jobs" type="xsd:positiveInteger" default="4"/>
        <xsd:element name="description" type="xsd:string"/>
        <xsd:element name="default-destruction-interval" type="xsd:positiveInteger" minOccurs="0"/>
        <xsd:element name="max-destruction-interval" minOccurs="0" type="xsd:positiveInteger"/>
        <xsd:element name="default-execution-duration" default="3600" minOccurs="0" type="xsd:positiveInteger"/>
        <xsd:element name="max-execution-duration" default="3600" minOccurs="0" type="xsd:positiveInteger"/>
        <xsd:element name="workers">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="worker" maxOccurs="unbounded" minOccurs="0" type="tns:worker"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
Appendix D
Master server README file

Requirements
============
- JDK 7+
- Application server supporting Java EE 7 with EJB container
  support (Wildfly, Glassfish, ...)
- Database (PostgreSQL, MySQL, ...)

Production install guide
========================
For instance I will use Debian amd64 with Wildfly 8.2
application server, JDK 8 and PostgreSQL 8.4

1. Install JDK 8
   Download JDK from
   http://www.oracle.com/technetwork/java/javase/downloads/index.html
   as a tar.gz archive, for example jdk-8u45-linux-x64.tar.gz
   Extract archive to /usr/lib/jvm
   Set up environment variables for java
   add these lines to the end of /etc/profile
   export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_45
   export PATH=$JAVA_HOME/bin:$PATH
2. Install Wildfly 8.2.0
   Download zip from http://wildfly.org/downloads/
   Extract archive to the /usr/local
   In the newly extracted wildfly directory execute bin/add-user.sh
   and set up new wildfly administering user.
3. Start Wildfly by executing bin/standalone.sh
   Server should successfully start.
   If everything went OK:
   Server is running on http://localhost:8080/
   Admin console on http://localhost:9990/
4. Install and configure PostgreSQL database server
   apt-get install postgresql
   login as postgres "su - postgres" and run client "psql template1"
   then type following commands to set up database for vocloud,
   CREATE USER vocloud WITH PASSWORD 'vocloud';
   CREATE DATABASE vocloud;
   GRANT ALL PRIVILEGES ON DATABASE vocloud TO vocloud;
   Note: You should really not use the same password as username.
   Do not forget to change it!
5. Configure database resource in Wildfly
   Log in to Wildfly admin console: http://localhost:9990/
   Type in credentials of administrating user
   Download JDBC for PostgreSQL https://jdbc.postgresql.org/
   In the admin console navigate to Deployments
   Click Add, select downloaded JDBC .jar file and click Ok
   Enable newly uploaded JDBC driver
   Navigate to Configuration tab
   Select Datasources
   Click Add
   Name: VocloudDS
   JNDI Name: java:jboss/datasources/vocloud
   Click Next
   Select postgresql jdbc driver
   Click Next
   Connection URL: "jdbc:postgresql://localhost:5432/vocloud" (without quotes)
   Username: vocloud
   Password: vocloud
   Click Done
   Enable VocloudDS
   Datasource can be tested in section Connection > Test connection
   Ping should be successful
6. Configure e-mail resource in Wildfly
   It is necessary to have an email address which will serve as
   the source of emails. For instance I will use address
   vocloud@vocloud.org where SMTP is running on port 465 and
   the host address of the SMTP server is smtp.vocloud.org.
   Navigate to Configuration section
   Select Socket Binding
   Click View on standard-sockets
   Select Outbound Remote section
   Click Add
   Name: vocloud-smtp
   Host: smtp.vocloud.org
   Port: 465
   Click Save
   Navigate to Mail subsystem section
   Click Add
   JNDI Name: "java:jboss/mail/vocloud-mail" (without quotes)
   Click View on the newly created mail session
   Click Add
   Socket binding: vocloud-smtp
   Type: smtp
   Username: username of the email server
   Password: password of the email server
   Check use SSL (if the port is 465)
   Click Save
7. Configure security in WildFly
   Navigate to Security Domains in Configuration section
   Click Add
   Name: VocloudSecurityDomain
   Click Save
   Click View on the newly created security domain
   Click Add
   Code: Database
   Flag: required
   Click Save
   Now click on the newly created Login module
   Click on Module Options
   Add the following key=value pairs
   dsJndiName = java:jboss/datasources/vocloud
   principalsQuery = select pass from useraccount where username=?
   rolesQuery = select groupName, 'Roles' from useraccount where username=?
   hashAlgorithm = SHA-256
   hashEncoding = hex
8. Deploy vocloud.war to the Wildfly server
   Navigate to section Deployments
   Click Add
   Select vocloud.war file
   Submit
   Enable the vocloud.war
   VO-CLOUD should now run on http://localhost:8080/vocloud
9. Create admin account
   Using http://localhost:8080/vocloud/register.xhtml
   Register a new account with username admin
   This account now has administrator privileges.
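The login module options in step 7 (hashAlgorithm SHA-256, hashEncoding hex) mean that the pass column is compared against the hex-encoded SHA-256 digest of the entered password. A minimal Python sketch of producing such a value, for instance to check a stored record by hand, is given below. The README does not say whether a salt or a particular character encoding is used, so unsalted UTF-8 is an assumption, and the function name is hypothetical.

```python
import hashlib


def password_digest(password: str) -> str:
    """Hex-encoded SHA-256 digest (assumption: UTF-8 input, no salt)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


digest = password_digest("change-me")
print(len(digest))  # 64 hexadecimal characters
```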
Appendix E
Universal worker README file

Requirements
============
- JDK 7+
- Java application server supporting Java servlet technology
  (tomcat, wildfly, ...)
- Maven tool (if building is necessary)
- Executable computational application for each desired
  computational type

Install guide
=============
For instance I will use Debian amd64 with Wildfly 8.2
application server, JDK 8 and Maven 3.1

1. Install JDK 8
   Download JDK from
   http://www.oracle.com/technetwork/java/javase/downloads/index.html
   as a tar.gz archive, for example jdk-8u45-linux-x64.tar.gz
   Extract archive to /usr/lib/jvm
   Set up environment variables for java
   add these lines to the end of /etc/profile
   export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_45
   export PATH=$JAVA_HOME/bin:$PATH
2. Install Wildfly 8.2.0
   Download zip from http://wildfly.org/downloads/
   Extract archive to the /usr/local
   In the newly extracted wildfly directory execute bin/add-user.sh
   and set up new wildfly administering user.
3. Start Wildfly by executing bin/standalone.sh
   Server should successfully start.
   If everything went OK:
   Server is running on http://localhost:8080/
   Admin console on http://localhost:9990/
4. Configure universal-worker configuration file (optional step
   if you want a different configuration than the one in the prebuilt
   archive)
   Download sources of universal-worker
   Go to src/main/resources/
   Adjust uws-config.xml file
   Go back to sources root
   Execute command "mvn package"
   Worker is compiled and the deployable archive is created in
   target/universal-worker.war
5. Deploy universal worker to Wildfly
   Open Wildfly admin console on http://localhost:9990/
   Login with the credentials of administrating user
   Navigate to Deployments section
   Click Add
   Select deployable universal-worker.war archive
   Click OK
   Enable the newly deployed application
   UWS service should now run on
   http://localhost:8080/universal-worker/uws

Note: This is only a description of the universal-worker application,
which serves as the mediator between the master server and the
executable computational application. In order to make a
worker fully functional you have to set the configuration
file of the universal-worker to point to the valid locations
of the executable computational applications. For more
information see the documentation of the specific executable
computational application.