File Filter (legacy)

class onetl.core.file_filter.file_filter.FileFilter(*, glob: str | None = None, regexp: Pattern | None = None, exclude_dirs: List[RemotePath] = None)

Filter files or directories by their path.

Deprecated since version 0.8.0: Use Glob, Regexp or ExcludeDir instead.

Parameters:
glob : str | None, default None

Glob pattern (e.g. *.csv) that a file path must match. Applied to files only, not to directories.

Warning

Mutually exclusive with regexp

regexp : str | re.Pattern | None, default None

Regular expression (e.g. \d+\.csv) that a file path must match. Applied to files only, not to directories.

If a string is passed, it is compiled with the re.IGNORECASE and re.DOTALL flags.

Warning

Mutually exclusive with glob

exclude_dirs : list[os.PathLike | str], default []

List of directories to exclude. A file or directory whose path contains any of them does not match.
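The effect of the two flags named for string regexp input can be seen with the standard library alone. This is a minimal sketch, not onetl's code: it assumes the pattern is applied with search() against the full path, which the text above does not state, and the sample path is made up for illustration.

```python
import re

# Assumption: a string pattern ends up compiled roughly like this,
# using the two flags named in the parameter description.
from_string = re.compile(r"\d+\.csv", re.IGNORECASE | re.DOTALL)

# A pattern compiled by the caller carries only the flags the caller chose.
precompiled = re.compile(r"\d+\.csv")

# Hypothetical path, used only for illustration.
upper_case_path = "/export/news_parse/2024.CSV"

# IGNORECASE lets ".csv" in the pattern match ".CSV" in the path.
string_result = from_string.search(upper_case_path) is not None

# Without IGNORECASE the same path is rejected.
precompiled_result = precompiled.search(upper_case_path) is not None
```

If case-sensitive matching is needed, passing a pre-compiled pattern is presumably the way to avoid the implicit flags.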

Examples

Create exclude_dir filter:

from onetl.core import FileFilter

file_filter = FileFilter(exclude_dirs=["/export/news_parse/exclude_dir"])

Create glob filter:

from onetl.core import FileFilter

file_filter = FileFilter(glob="*.csv")

Create regexp filter:

from onetl.core import FileFilter

file_filter = FileFilter(regexp=r"\d+\.csv")

# or

import re

file_filter = FileFilter(regexp=re.compile(r"\d+\.csv"))

Not allowed:

from onetl.core import FileFilter

FileFilter()  # raises ValueError: at least one argument must be passed

match(path: PathProtocol) → bool

Returns False if the path does not match the filter, i.e. the file or directory should be skipped.
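The matching rules described on this page can be approximated with pathlib. The following is a rough stand-in, not the library's implementation: sketch_match and its is_file argument are hypothetical names, and the order of checks (excluded directories first, then the glob for files only) is an assumption based on the parameter descriptions.

```python
from pathlib import PurePosixPath


def sketch_match(path, *, is_file, glob=None, exclude_dirs=()):
    """Hypothetical stand-in for FileFilter.match, for illustration only."""
    path = PurePosixPath(path)

    # Anything equal to or nested under an excluded directory is rejected.
    for excluded in map(PurePosixPath, exclude_dirs):
        if path == excluded or excluded in path.parents:
            return False

    # The glob is checked for files only; directories pass through.
    if glob is not None and is_file:
        return path.match(glob)

    return True


csv_file = sketch_match("/export/news_parse/a.csv", is_file=True, glob="*.csv")
txt_file = sketch_match("/export/news_parse/a.txt", is_file=True, glob="*.csv")
directory = sketch_match("/export/news_parse", is_file=False, glob="*.csv")
excluded_file = sketch_match(
    "/export/news_parse/exclude_dir/a.csv",
    is_file=True,
    glob="*.csv",
    exclude_dirs=["/export/news_parse/exclude_dir"],
)
```

In this sketch a directory is never rejected by the glob, which mirrors the "files only" wording in the parameter descriptions.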