Commit History
Merge pull request #92 from seanpedrick-case/regex_search
c2d2ccd
unverified
Sean Pedrick-Case
committed on
Added regex search feature for multi-word text search
21318d3
Minor update to cli_redact for new local OCR model options. Updated app_settings.qmd, user_guide.qmd, and readme.md with descriptions of new features
d5b5291
Fixed minor bugs related to Textract API calls, pyproject format. Removed print statements and fixed some future concat deprecation issues
7bb945f
formatter and linter applied
ca530a1
Merge pull request #89 from seanpedrick-case/textract_type_name_output
00011db
unverified
Sean Pedrick-Case
committed on
Added suffix to textract output files according to tasks included (e.g. signature analysis). Improved reporting when Textract client doesn't exist. Fixed display for cost and time taken. Changes to config variables to allow exclusion of PaddleOCR from display
25e2089
Improved Paddle and hybrid OCR analysis across all options. Tried to revise requirements for Spaces
2c00d05
Added paddle to pre-requirements.txt
01c8eb6
Allowed Paddle to load at startup. Updated requirements for torch compatibility
bf83b6f
Updated requirements
1935c45
Updated requirements for torch. Updated main hf flow to force changes to spaces repo
e59fbb7
Updated dependencies and the GitHub to HF workflow
059a5f7
Updated sync to hf workflow for zero GPU space sync
27ed5c8
Updated readme for install instructions with paddle, vlms
c3ccad4
Merge pull request #88 from seanpedrick-case/vlm_support
cd01917
unverified
Sean Pedrick-Case
committed on
formatter and linter applied
bcb5ad4
Updated word segmenter code
4440bed
Similar cleanup to requirements_lightweight.txt
ef8c72e
Cleaned requirements.txt file
40bd54b
Updated test suites to use the lightweight version of requirements.txt
f5146c7
Added text rotation capability
1ff0b3d
Optimised VLM model selection
54a5789
Optimised VLM model choice and prompting/parameters
ad60619
Added hybrid paddle + vlm option. Optimised word segmenters for single words. Optimised package installation in pyproject.toml
6d4f6e4
Added upgraded line to word parsing algorithm. Added dependencies and framework for Huggingface spaces deployment with ZeroGPU
c2becd8
Improved new requirements. Improved visual OCR outputs and word-level Paddle outputs and general bounding box positioning
e4493fe
Initial commit for VLM support. Created visualisations for OCR output. Corrected log_file_output_paths reference.
5e01004
Again revised spaCy language model load for different languages
2f34683
Modified model load for custom languages with spaCy. Languages should load successfully now.
2148ddd
User ownership folder change to whole user folder in Dockerfile. Minor changes to documentation
bf7b066
Ensured that AWS credentials are called correctly in logger settings.
43c7a6d
Updated user guide and app settings. Updated some additional lambda_entrypoint arguments. Ensured that examples are correctly displayed on GUI.
c543ba0
Added head attribute to Gradio Blocks context to enable enforcement of direct vs relative file paths. Updates to direct mode/lambda entrypoint to ensure as many options as possible can be user defined
febacad
Merge pull request #80 from seanpedrick-case/main
41e7358
unverified
Sean Pedrick-Case
committed on
Fix condition check for SHOW_EXAMPLES
57de024
unverified
Sean Pedrick-Case
committed on
Merge pull request #79 from seanpedrick-case/dev
b0dca2c
unverified
Sean Pedrick-Case
committed on