Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Corpus module #13

Merged
merged 6 commits into from
Aug 1, 2019
Merged

Corpus module #13

merged 6 commits into from
Aug 1, 2019

Conversation

Olamyy
Copy link
Contributor

@Olamyy Olamyy commented Jul 15, 2019

Introduces a number of changes:

  1. Restructured the file structure by:

     1. Moving some functions from `adr.py` to `utils.py`. 
     These function seem more like generally reusable utilities than actual adr utilities. 
    
     2. Moving another set of functions from `adr.py` to `preprocessing.py`.
      These function seem more like generally reusable preprocessing functions than actual adr functions. 
    
  2. Converted existing tests to unittest. Major win here is that the tests are now in reusable OOP format.

  3. Introduces some of the enhancements in Corpus Loading Features #12 :

    1. corpus.py : Introduces the Corpus and DirectoryCorpus classes to help users load corpus into iranlowo.
    2. loaders.py : Introduces a bunch of classes for loading text already available in https://github.com/Niger-Volta-LTI/yoruba-text into the Corpus/DirectoryCorpus classes. The currently available loaders are: BBC, Bibeli, Owe, Yoruba Blog. [note: I'm not exactly proud of my implementation of the OweLoader so I won't mind someone taking another look at it before merging.]

@Olamyy
Copy link
Contributor Author

Olamyy commented Jul 15, 2019

Failing tests. I would check this and fix them as soon as I can.

@ruohoruotsi
Copy link
Member

Hi @Olamyy ! Thank you very much for your PR and overall contributions!! These look like useful additions to the library! Overall, I think it might be more collaborative to discuss, design & review proposed changes before we get to the PR phase (where we're just battling Travis issues), especially to ensure that we are synchronized on goals & to get opinions like from David, Timi and others. What do you think of this approach?

Regarding the build failure, it is perhaps because you need to add an __init__.py at the top level of the tests directory, otherwise, without it, tests.utils is not a package like you are using it as such. See: https://docs.pytest.org/en/latest/goodpractices.html .

Screen Shot 2019-07-18 at 12 42 07 PM

@ruohoruotsi ruohoruotsi self-assigned this Jul 18, 2019
@ruohoruotsi ruohoruotsi added the help wanted Extra attention is needed label Jul 18, 2019
@Olamyy
Copy link
Contributor Author

Olamyy commented Jul 18, 2019

Yeah. That definitely seems like a better and more organized approach to working on features. Which of github or slack do you think would be a good medium for this?

About the failing tests, I came about this same link while trying to figure out what the issue was. I just haven't looked at it in details yet. I'll get to it as soon as I can.

Thanks.

@ruohoruotsi
Copy link
Member

Great! Let me make a Project board and some milestones and you can add some cards, so basically github. This way the work can stay localized to the project, even if we have a higher reliance on github.

Maybe I'll take a fork of your fork and see if the __init__.py solution will work.

@ruohoruotsi
Copy link
Member

The new project is here: https://github.com/Niger-Volta-LTI/iranlowo/projects

Copy link
Member

@ruohoruotsi ruohoruotsi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay, will try it out once changes are merged and tweak anything. The only question I have is whether you ran the Black code formatter? https://black.readthedocs.io/en/stable/

I've been using this and liking it a lot, so have been using it to to keep my codes presentable. I should probably figure out how to add it to the linting system within Travis ... 🤔

@ruohoruotsi ruohoruotsi merged commit 0046b61 into Niger-Volta-LTI:master Aug 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants