The source project of this merge request has been removed.
Closed
Opened Aug 04, 2017 by Ghost User (@ghost1)

FilenameInputPlugin for multiple directories, running in multiple threads.

  1. New configuration (required): multi_dir. Example: multi_dir: ["sample/sample_","../example/example_"]

  2. New configuration (optional, default value is []): multi_tag. Example: multi_tag: ["tag1","tag2"]. If multi_tag is shorter than multi_dir, blank strings are added automatically to make up the difference.

  3. Since the directories can now be set in multi_dir, path_prefix is deprecated.

  4. New configuration (optional, default value is 0): order_by_modified_time. Example: order_by_modified_time: 1. If the value is 1, the files in each directory are sorted by modified time in ascending order. If the value is neither 1 nor 0, they are sorted by modified time in descending order. If the value is the default 0, they are sorted alphabetically.

  5. New configuration (optional, default value is 0): order_by_creation_time. Example: order_by_creation_time: 1. Be careful: many Unix file systems do not record, or do not expose, the creation time of files; Windows does.

  6. New configuration (optional, default value: 1024*1024*10): chunk_size. Example: chunk_size: 100000
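The padding rule for multi_tag and the three sort modes of order_by_modified_time can be sketched roughly as follows. This is only an illustration of the behaviour described in the list, not the code in this merge request: the class `MultiDirConfigSketch`, the helper `FileInfo`, and the method names are hypothetical, and files are modelled as name plus modified time so the example runs without touching the file system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class MultiDirConfigSketch {

    /** Hypothetical stand-in for a file: a name and a modified time in milliseconds. */
    static final class FileInfo {
        final String name;
        final long modifiedMillis;

        FileInfo(String name, long modifiedMillis) {
            this.name = name;
            this.modifiedMillis = modifiedMillis;
        }
    }

    /** Appends blank strings to the tags until there is one tag per directory. */
    static List<String> padTags(List<String> multiDir, List<String> multiTag) {
        List<String> tags = new ArrayList<>(multiTag);
        while (tags.size() < multiDir.size()) {
            tags.add("");
        }
        return tags;
    }

    /** 0: alphabetical; 1: modified time ascending; any other value: modified time descending. */
    static Comparator<FileInfo> orderFor(int orderByModifiedTime) {
        if (orderByModifiedTime == 0) {
            return Comparator.comparing((FileInfo f) -> f.name);
        }
        Comparator<FileInfo> byTime = Comparator.comparingLong((FileInfo f) -> f.modifiedMillis);
        return orderByModifiedTime == 1 ? byTime : byTime.reversed();
    }

    /** Returns the file names of one directory in the order selected by the mode. */
    static List<String> sortedNames(List<FileInfo> files, int orderByModifiedTime) {
        List<FileInfo> copy = new ArrayList<>(files);
        copy.sort(orderFor(orderByModifiedTime));
        List<String> names = new ArrayList<>();
        for (FileInfo f : copy) {
            names.add(f.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("sample/sample_", "../example/example_");
        System.out.println(padTags(dirs, Arrays.asList("tag1")));

        List<FileInfo> files = Arrays.asList(
                new FileInfo("b", 100L), new FileInfo("a", 300L), new FileInfo("c", 200L));
        System.out.println(sortedNames(files, 0));
        System.out.println(sortedNames(files, 1));
        System.out.println(sortedNames(files, 2));
    }
}
```

A real implementation would presumably read the modified time from the file system, for example with `java.io.File.lastModified()`, instead of a hand-built `FileInfo`.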

This plugin loads files from several directories. Each directory is handled as one task, and each task runs in its own thread, so running the tasks in multiple threads speeds up the upload. The file order is preserved within each directory.
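The one-task-per-directory idea can be illustrated with a plain `ExecutorService`. This is a minimal sketch under assumptions, not the plugin's code: it does not use the Embulk API, the class `DirectoryTasksSketch` and method `runAll` are hypothetical, and "uploading" a file is reduced to recording its name. Directories run in parallel, while the files of a single directory are handled sequentially, which is what keeps their order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class DirectoryTasksSketch {

    /** Runs one task per directory and returns, per directory, the files in processing order. */
    static Map<String, List<String>> runAll(Map<String, List<String>> filesByDir, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per directory; tasks may run concurrently.
            Map<String, Future<List<String>>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, List<String>> entry : filesByDir.entrySet()) {
                List<String> files = entry.getValue();
                pending.put(entry.getKey(), pool.submit(() -> {
                    List<String> processed = new ArrayList<>();
                    for (String file : files) {
                        processed.add(file); // placeholder for reading and uploading the file
                    }
                    return processed;
                }));
            }
            // Wait for every directory task to finish.
            Map<String, List<String>> done = new LinkedHashMap<>();
            for (Map.Entry<String, Future<List<String>>> entry : pending.entrySet()) {
                done.put(entry.getKey(), entry.getValue().get());
            }
            return done;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for directory tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a directory task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> filesByDir = new LinkedHashMap<>();
        filesByDir.put("sample/", Arrays.asList("sample_1", "sample_2"));
        filesByDir.put("../example/", Arrays.asList("example_1", "example_2", "example_3"));
        System.out.println(runAll(filesByDir, 2));
    }
}
```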

This plugin now creates the pages for each file itself, so the none-bin parser is no longer needed and is deprecated.
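One way to picture how a file could be split before pages are created is to read it in pieces of at most chunk_size. The sketch below assumes that chunk_size is a maximum number of bytes per piece, which the description does not state, and it does not build real Embulk pages. The class `ChunkReaderSketch` and method `readChunks` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChunkReaderSketch {

    /** Splits the stream into byte arrays of chunkSize bytes; the last one may be shorter. */
    static List<byte[]> readChunks(InputStream in, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        byte[] buffer = new byte[chunkSize];
        int filled = 0;
        try {
            int n;
            // read() may return fewer bytes than requested, so keep filling the buffer.
            while ((n = in.read(buffer, filled, chunkSize - filled)) != -1) {
                filled += n;
                if (filled == chunkSize) {
                    chunks.add(Arrays.copyOf(buffer, filled));
                    filled = 0;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (filled > 0) {
            chunks.add(Arrays.copyOf(buffer, filled)); // trailing partial chunk
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] data = new byte[10];
        List<byte[]> chunks = readChunks(new ByteArrayInputStream(data), 4);
        for (byte[] chunk : chunks) {
            System.out.println(chunk.length);
        }
    }
}
```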

For the output plugin, I rewrote the Ruby code in Java. The configuration is the same as for the Ruby plugin.

Reference: klaus/embulk-input-filename!3