Commit 484a14e9 authored Sep 25, 2020 by Roque
erp5_wendelin_data_lake_ingestion: script to get size of all data lake and by dataset
parent 07e4bdb7
Showing 2 changed files with 107 additions and 0 deletions

bt5/erp5_wendelin_data_lake_ingestion/SkinTemplateItem/portal_skins/erp5_wendelin_data_lake/ERP5Site_checkIngestedData.py (+45 -0)
bt5/erp5_wendelin_data_lake_ingestion/SkinTemplateItem/portal_skins/erp5_wendelin_data_lake/ERP5Site_checkIngestedData.xml (+62 -0)
bt5/erp5_wendelin_data_lake_ingestion/SkinTemplateItem/portal_skins/erp5_wendelin_data_lake/ERP5Site_checkIngestedData.py
0 → 100644
import json

portal = context.getPortalObject()
portal_catalog = portal.portal_catalog

def getDatasetInfo(data_set):
  size = 0
  datastream_result_dict = json.loads(portal.ERP5Site_getDataStreamList(data_set.getReference()))
  for stream_dict in datastream_result_dict['result']:
    size += stream_dict['full-size']
  return len(datastream_result_dict['result']), size

def format_size(num, suffix='b'):
  for unit in ['', 'K', 'M', 'G', 'T', 'P', 'E', 'Z']:
    if abs(num) < 1024.0:
      return "%3.1f %s%s" % (num, unit, suffix)
    num /= 1024.0
  return "%.1f %s%s" % (num, 'Yi', suffix)

data_set_list = []
if data_set_reference:
  try:
    data_set = portal.data_set_module.get(data_set_reference)
    if data_set is None or portal.ERP5Site_checkReferenceInvalidated(data_set):
      return "Not found: there is no valid dataset for that reference"
    data_set_list.append(data_set)
  except Exception as e:
    # fails because unauthorized access
    return "ERROR: " + str(e)
else:
  data_set_list = portal_catalog(portal_type="Data Set", validation_state='validated OR published')

total_size = 0
for data_set in data_set_list:
  print "Data set " + data_set.getReference()
  nfiles, size = getDatasetInfo(data_set)
  total_size += size
  print " #files: " + str(nfiles)
  print " Size: " + format_size(size)
  print
if len(data_set_list) > 1:
  print
  print "TOTAL SIZE: " + format_size(total_size)
return printed
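The script above only runs inside a Zope/ERP5 Python Script, where `context`, the `print` statement and `printed` are provided by the runtime. To try the two helpers outside that environment, a minimal standalone Python 3 sketch could look like the following. It is not part of the commit: `get_dataset_info` and `fake_payload` are hypothetical names, and the payload only imitates the `{"result": [{"full-size": ...}]}` shape that the script reads from `ERP5Site_getDataStreamList`.

```python
import json


def format_size(num, suffix="b"):
    """Python 3 port of the committed helper: divide by 1024 per unit step."""
    for unit in ["", "K", "M", "G", "T", "P", "E", "Z"]:
        if abs(num) < 1024.0:
            return "%3.1f %s%s" % (num, unit, suffix)
        num /= 1024.0
    # Fallthrough kept exactly as in the commit, including the 'Yi' label.
    return "%.1f %s%s" % (num, "Yi", suffix)


def get_dataset_info(stream_list_json):
    """Return (file count, total size) from a JSON string.

    Assumed shape, taken from how the script reads the result:
    {"result": [{"full-size": <int>}, ...]}
    """
    result = json.loads(stream_list_json)["result"]
    return len(result), sum(stream["full-size"] for stream in result)


# Hypothetical payload standing in for the ERP5Site_getDataStreamList output.
fake_payload = json.dumps({"result": [{"full-size": 1024}, {"full-size": 512}]})

nfiles, size = get_dataset_info(fake_payload)
print("#files: %d" % nfiles)
print("Size: " + format_size(size))
```

Note that the fallthrough branch labels the unit `Yi` while the loop uses `K`, `M`, `G` and so on, so a value of 1024**8 would be rendered as `1.0 Yib`; the sketch keeps that behavior rather than guessing at the intended label.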
bt5/erp5_wendelin_data_lake_ingestion/SkinTemplateItem/portal_skins/erp5_wendelin_data_lake/ERP5Site_checkIngestedData.xml
0 → 100644
<?xml version="1.0"?>
<ZopeData>
  <record id="1" aka="AAAAAAAAAAE=">
    <pickle>
      <global name="PythonScript" module="Products.PythonScripts.PythonScript"/>
    </pickle>
    <pickle>
      <dictionary>
        <item>
            <key> <string>Script_magic</string> </key>
            <value> <int>3</int> </value>
        </item>
        <item>
            <key> <string>_bind_names</string> </key>
            <value>
              <object>
                <klass>
                  <global name="NameAssignments" module="Shared.DC.Scripts.Bindings"/>
                </klass>
                <tuple/>
                <state>
                  <dictionary>
                    <item>
                        <key> <string>_asgns</string> </key>
                        <value>
                          <dictionary>
                            <item>
                                <key> <string>name_container</string> </key>
                                <value> <string>container</string> </value>
                            </item>
                            <item>
                                <key> <string>name_context</string> </key>
                                <value> <string>context</string> </value>
                            </item>
                            <item>
                                <key> <string>name_m_self</string> </key>
                                <value> <string>script</string> </value>
                            </item>
                            <item>
                                <key> <string>name_subpath</string> </key>
                                <value> <string>traverse_subpath</string> </value>
                            </item>
                          </dictionary>
                        </value>
                    </item>
                  </dictionary>
                </state>
              </object>
            </value>
        </item>
        <item>
            <key> <string>_params</string> </key>
            <value> <string>data_set_reference=None</string> </value>
        </item>
        <item>
            <key> <string>id</string> </key>
            <value> <string>ERP5Site_checkIngestedData</string> </value>
        </item>
      </dictionary>
    </pickle>
  </record>
</ZopeData>