File Filter (legacy)
- class onetl.core.file_filter.file_filter.FileFilter(*, glob: str | None = None, regexp: Pattern | None = None, exclude_dirs: List[RemotePath] = None)
Filter files or directories by their path.
Deprecated since version 0.8.0: Use Glob, Regexp or ExcludeDir instead.
- Parameters:
- glob: str | None, default None
Pattern (e.g. *.csv) that any file path should match (applies to files only).
Warning: Mutually exclusive with regexp.
- regexp: str | re.Pattern | None, default None
Regular expression (e.g. \d+\.csv) that any file path should match (applies to files only).
If the input is a string, the regular expression will be compiled using the re.IGNORECASE and re.DOTALL flags.
Warning: Mutually exclusive with glob.
- exclude_dirs: list[os.PathLike | str], default []
List of directories which should not be a part of a file or directory path.
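The parameter semantics above can be approximated with the standard library alone. The sketch below is not onetl's implementation: the helper names `matches_glob`, `compile_regexp` and `is_excluded` are hypothetical, and the use of `PurePosixPath.match`, `Pattern.search` and component-wise path comparison are assumptions. Only the `re.IGNORECASE` and `re.DOTALL` flags for string input are taken from the description.

```python
import re
from pathlib import PurePosixPath
from typing import Iterable, Pattern, Union


def matches_glob(path: str, pattern: str) -> bool:
    # Assumption: a relative pattern such as "*.csv" is compared
    # against the trailing components of the path.
    return PurePosixPath(path).match(pattern)


def compile_regexp(regexp: Union[str, Pattern]) -> Pattern:
    # A string is compiled with IGNORECASE and DOTALL, as documented;
    # an already compiled pattern is returned unchanged.
    if isinstance(regexp, str):
        return re.compile(regexp, re.IGNORECASE | re.DOTALL)
    return regexp


def is_excluded(path: str, exclude_dirs: Iterable[str]) -> bool:
    # Assumption: a path is excluded if it is one of the listed
    # directories or lies anywhere below one of them.
    candidate = PurePosixPath(path)
    for directory in exclude_dirs:
        excluded = PurePosixPath(directory)
        if candidate == excluded or excluded in candidate.parents:
            return True
    return False
```

Because a string pattern is compiled with `re.IGNORECASE`, `compile_regexp(r"\d+\.csv")` also finds `2023.CSV`, while a pattern precompiled without flags stays case-sensitive. Comparing whole path components keeps a sibling such as `exclude_dir_2` from being excluded by accident.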
Examples
Create exclude_dir filter:
    from onetl.core import FileFilter

    file_filter = FileFilter(exclude_dirs=["/export/news_parse/exclude_dir"])
Create glob filter:
    from onetl.core import FileFilter

    file_filter = FileFilter(glob="*.csv")
Create regexp filter:
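    import re

    from onetl.core import FileFilter

    # pass the pattern as a string
    file_filter = FileFilter(regexp=r"\d+\.csv")

    # or pass a precompiled pattern (raw string avoids invalid escape warnings)
    file_filter = FileFilter(regexp=re.compile(r"\d+\.csv"))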
Not allowed:
    from onetl.core import FileFilter

    FileFilter()  # raises ValueError: at least one argument should be passed
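The argument rules described above (at least one argument, and glob mutually exclusive with regexp) can be sketched as a standalone check. The function `check_filter_arguments` is hypothetical and is not how onetl validates its input. Raising `ValueError` for the mutually exclusive case is an assumption; the source only confirms `ValueError` for the empty call.

```python
from typing import Optional, Sequence


def check_filter_arguments(
    glob: Optional[str] = None,
    regexp: Optional[str] = None,
    exclude_dirs: Sequence[str] = (),
) -> None:
    # glob and regexp cannot be combined (error type is an assumption)
    if glob is not None and regexp is not None:
        raise ValueError("glob and regexp are mutually exclusive")
    # an empty filter is rejected, mirroring FileFilter()
    if glob is None and regexp is None and not exclude_dirs:
        raise ValueError("at least one argument should be passed")
```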
- match(path: PathProtocol) -> bool
Returns False if the path does not match the filter, meaning the file should not be received.
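As an illustration of how a `match`-style predicate is typically consumed, the sketch below filters a listing with any object that exposes `match(path) -> bool`. The names `SupportsMatch`, `CsvOnly` and `keep_matching` are hypothetical stand-ins, and `PurePosixPath` is used here in place of onetl's `PathProtocol`.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Protocol


class SupportsMatch(Protocol):
    # Anything with a match() method returning bool fits this sketch
    def match(self, path: PurePosixPath) -> bool: ...


class CsvOnly:
    """Hypothetical stand-in filter, not part of onetl."""

    def match(self, path: PurePosixPath) -> bool:
        # True keeps the path, False drops it
        return path.suffix.lower() == ".csv"


def keep_matching(paths: Iterable[PurePosixPath], file_filter: SupportsMatch) -> List[PurePosixPath]:
    # Keep only the paths for which the filter returns True
    return [path for path in paths if file_filter.match(path)]
```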