Skip to content

Test project on Windows with MINGW git (conda2.7&3.4/cpy-3.5)#519

Merged
Byron merged 42 commits intogitpython-developers:masterfrom
ankostis:appveyor
Oct 1, 2016
Merged

Test project on Windows with MINGW git (conda2.7&3.4/cpy-3.5)#519
Byron merged 42 commits intogitpython-developers:masterfrom
ankostis:appveyor

Conversation

@ankostis
Copy link
Contributor

@ankostis ankostis commented Sep 25, 2016

  • Use Appveyor integration servers;
  • Copy part of the travis logic, to speed-up test-time (no flake8, site & coverage), the only extra is wheel-packaging;
  • Limit combination of Anaconda, CPython and MINGW/Cygwin in only 6 variations.
  • Replaced poll/select buffer-assembly code with pump-threads (Win TCs faling 90-->20!) and rework unicode handling in many places.
  • Popen() procs it with subprocess.CREATE_NEW_PROCESS_GROUP so that they can be killed.
  • Invoke MINGW git-daemon (instead of git daemon); otherwise it detaches (unix-ism here) from parent git cmd, and then CANNOT DIE!
  • Stop reading directly Popen().proc.stdout/stderr from the launching-thread, it freezes due to GIL/buffering.
    • Unslurp diff consumption code (but not raw-patch) to handle it line-by line to support arbitrary big output.
  • Ensure read-only files do delete on Windows.
    +Rework GitCommandError.str() and add TCs on unicode handling.
  • Fixed TCs on config-reader/writer:
    • Modify lock/read-config-file code to ensure files closed.
    • Use with GitConfigarser() more systematically in TCs.
    • Clean up any locks left hanging from prev TCs.
    • Util: mark lock-files as SHORT_LIVED; save some SSDs...
  • Fixed TestRepo.test_submodule_update():
    • submod: del .git file prior overwrite; Windows denied otherwise!
  • FIX TestRepo.test_untracked_files():
    • In the git add <file> case, PY2 fails when args are unicode.
      Must encode them with locale.getpreferredencoding() AND use SHELL.
  • cmd: add shell into execute() kwds, for overriding USE_SHELL per
    command.
  • repo: replace blocky communicate() in _clone() with thread-pumps (UNNECESSARY, see this later discussion).
  • test_repo.py: unittestize (almost all) assertions.
  • Fixed file-access problems by using with... construct (not complete).
    • Replace open --> with open for index (base and TC).
  • Enabled some TCs, more remain dormant (ie the Hook TC never runs on Linux).
    • test_index.py: Enabled dormant fake-symlink assertion.
  • Various encoding issues.
  • Debug-log all Popen() calls, to collect cmd usage.

Apveyor results

  • Py27: FAILED (SKIP=3, errors=1)
  • Py34: FAILED (SKIP=3, errors=3)
  • Py35 : FAILED (SKIP=3, errors=3)

@Byron
Copy link
Member

Byron commented Sep 25, 2016

@ankostis Totally awesome ! Even though I won't be of much help, I do hope that fixing the windows issues work out in the end. That would restore windows platform support and keep it, after all these years of supporting it just on a best-effort basis, which meant not to obviously use non-windows APIs.

@ankostis ankostis force-pushed the appveyor branch 2 times, most recently from 1774f98 to d773d40 Compare September 25, 2016 16:31
@ankostis
Copy link
Contributor Author

Yes, that's a start.
I wish that today this PR become "ready", but currently, it keeps on failing,
and adding gc.collect() in various places...

@ankostis ankostis force-pushed the appveyor branch 3 times, most recently from 5899c59 to d12fdca Compare September 25, 2016 22:01
@ankostis
Copy link
Contributor Author

@Byron the git.testtest_git:TestGit.test_handle_process_output() TC hangs almost on every combination under Windows.
Can you help unstuck it?

ankostis added a commit to ankostis/GitPython that referenced this pull request Sep 25, 2016
+ Pump once, so when input larger than `mmap.PAGESIZE`, stream is not
emptied.
ankostis added a commit to ankostis/GitPython that referenced this pull request Sep 25, 2016
+ Pump once, so when input larger than `mmap.PAGESIZE`, stream is not
emptied.
ankostis added a commit to ankostis/GitPython that referenced this pull request Sep 25, 2016
+ Pump once, so when input larger than `mmap.PAGESIZE`, stream is not
emptied.
ankostis added a commit to ankostis/GitPython that referenced this pull request Sep 26, 2016
+ Pump code reads only once streams per `_read_lines_from_fno()`
invocation, so when input larger than `mmap.PAGESIZE`, bytes are
forgotten in the stream.
ankostis added a commit to ankostis/GitPython that referenced this pull request Sep 26, 2016
+ The code in `_read_lines_from_fno()` was reading the stream only once
per invocation, so when input was larger than `mmap.PAGESIZE`, bytes
were forgotten in the stream.
+ Also set deamon-threads.
ankostis added a commit to ankostis/GitPython that referenced this pull request Sep 26, 2016
+ The code in `_read_lines_from_fno()` was reading the stream only once
per invocation, so when input was larger than `mmap.PAGESIZE`, bytes
were forgotten in the stream.
+ Replaced buffer-banging code with iterate-on-file-descriptors.
+ Also set deamon-threads.
ankostis added a commit to ankostis/GitPython that referenced this pull request Sep 26, 2016
+ The code in `_read_lines_from_fno()` was reading the stream only once
per invocation, so when input was larger than `mmap.PAGESIZE`, bytes
were forgotten in the stream.
+ Replaced buffer-building code with iterate-on-file-descriptors.
+ Also set deamon-threads.
@ankostis
Copy link
Contributor Author

ankostis commented Sep 26, 2016

@Byron that was a tough one!
The pump code in cmd:handle_process_output() was crafted for the select/poll and was reading the stream only once per _read_lines_from_fno() invocation; so when input was larger than mmap.PAGESIZE, bytes were forgotten in the stream, and the child-process hang.

I replaced the buffer-assembly code with iterate-on-file-descriptors. Actually the code is so simple that it maybe worth to replace the select-poll with the 2-pumps code, for all OSs?

Many more bugs remain in Windows, but no hangs.
[edit]Actually with this bug fixed, the failed TCs on Windows dropped from ~90-->20!!

A side question:
In cmd.handle_process_output() you write:

NOTE: Just joining threads can possibly fail as there is a gap between .start() and when it's
actually started, which could make the wait() call to just return because the thread is not yet
active

Is that possible? I would expect any language allowing such "gap" to be considered outright broken.
But I couldnt google anything relevant?

ankostis added a commit to ankostis/GitPython that referenced this pull request Sep 26, 2016
ankostis added a commit to ankostis/GitPython that referenced this pull request Sep 26, 2016
+ Extract util-method to delete lock-files, also on Windows (will be
needed by TCs).
+ Modify lock/read-config-file code to ensure files closed.
+ Use `with GitConfigarser()` more systematically in TCs.
+ Clear any locks left hanging from prev Tcs.
+ Util: mark lock-files as SHORT_LIVED; save some SSDs...
@Byron
Copy link
Member

Byron commented Oct 1, 2016

@ankostis Incredible work you are doing here ! Me not replying most of the time is merely an indication of me not touching the keyboard after work, so replies have to wait for the weekend. That said, I will try to address as much as possible, and in case I forget anything, please feel free to point me to it one more time.

Is that possible? I would expect any language allowing such "gap" to be considered outright broken.
But I couldnt google anything relevant?

I believe that is just me being paranoid, and considering that the internet doesn't know this issue, it probably just doesn't exist. Maybe you could delete that comment, while you are at it.

I replaced the buffer-assembly code with iterate-on-file-descriptors. Actually the code is so simple that it maybe worth to replace the select-poll with the 2-pumps code, for all OSs?

Without having had a look at the code, I may state that it must be possible to fetch data from stdout and stderr concurrently, if both channels are open. Otherwise you might see that blocking behaviour, which has bitten me in the past. If that would indeed be the case, I'd be fine with it. However, the only ways I could think of either was select/poll or threading.
Edit: Now I have had a look at that part of the code and saw it's using threads. So the above prerequisite is met. My only objection against using it for linux as well is that added overhead for spinning up 2 threads - as the select/poll code seems to work by now, I'd probably just leave it.

Do I have the permission to clean them up as I bump into them?
(the extensive TCs should catch any mishaps)

Yes, please go ahead ! Everything that makes the code more idiomatic is appreciated !

[edit] Actually discovered a needless WinError try..catch, already converted as GitCommandNotFound by execute(). Simplify _call_process(), separate without(!) win-code path and without make_call() nested-function.

Thanks for figuring this out ! As always, much appreciated :) !

Besides, I believe he "stream-pump" code runs slightly faster and it is a lot simpler.

Referring back to the select/poll vs Threading argument I made: Considering the added maintenance effort for having two versions of that, along with the relative rarity of doing a pull/push in the average use case, one might indeed choose to prefer using the threaded pumps instead. I will trust your expertise entirely with this one, so please feel free to remove the alternative code-path as you see fit.

@ankostis You have been relentlessly working on getting this fixed, and I hope to soon have it all green on AppVeyor. It will hardly be possible to give you all the credit you deserve, but if you are interested, I would gladly make you a contributor. This could also include becoming a maintainer on Pypi to allow you pushing builds yourself. Just let me know, and I will make it happen.

@ankostis
Copy link
Contributor Author

ankostis commented Oct 1, 2016

Thank you for your offer to become contributor, I'm honored. Unfortunately I cannot promise that I will be contributing to the project in the future; usually we are always late for deadlines :-(

Regarding the thread-pump, the situation is not satisfying, both alternatives (threading, select/poll) suffer from a race condition if the python code is not absorbing data fast enough from all pipes. Threading is not enough because of how PIPE for `stdin/stderr interact with the GIL and the buffers available. In order not to rely on python's buffers, we better use Queues.
Having scanted through Queue's code, I think I remember some internal, non-GIL, threads maintaining the queue, and that guarantees that they do not block when some buffer-capacity is reached.

But that can wait.

@Byron
Copy link
Member

Byron commented Oct 1, 2016

That's good enough ! I'd be happy already if you can have an eye out on the windows side of things, as you seem to have larger stakes in it than most others.
Even though you seem to be very close to getting all-greens on appveyor, if you at some point stop having time to finish it, please feel free to merge the branch so that the existing work doesn't deteriorate over time.

PS: You have been invited to become a contributor/maintainer.

+ FIX TestRepo.test_submodule_update():
  + submod: del `.git` file prior overwrite; Windows denied otherwise!
+ FIX TestRepo.test_untracked_files():
  + In the `git add <file>` case, it failed with unicode args on PY2.
Had to
encode them with `locale.getpreferredencoding()` AND use SHELL.
+ cmd: add `shell` into `execute()` kwds, for overriding USE_SHELL per
command.
+ repo: replace blocky `communicate()` in `_clone()` with thread-pumps.
+ test_repo.py: unittestize (almost  all) assertions.
+ Replace open --> with open for index (base and TC).
+ test_index.py: Enabled a dormant assertion.
@ankostis
Copy link
Contributor Author

ankostis commented Oct 1, 2016

Reporting progress so far; strangely, almost all of the problems fixed below did not appear in Appveyor:

  • Fixed TCs on config-reader/writer:
    • Modify lock/read-config-file code to ensure files closed.
    • Use with GitConfigarser() more systematically in TCs.
    • Clean up any locks left hanging from prev TCs.
    • Util: mark lock-files as SHORT_LIVED; save some SSDs...
  • Fixed TestRepo.test_submodule_update():
    • submod: del .git file prior overwrite; Windows denied otherwise!
  • FIX TestRepo.test_untracked_files():
    • In the git add <file> case, PY2 fails when args are unicode.
      Must encode them with locale.getpreferredencoding() AND use SHELL.

Generic changes

  • cmd: add shell into execute() kwds, for overriding USE_SHELL per
    command.
  • repo: replace blocky communicate() in _clone() with thread-pumps.
  • test_repo.py: unittestize (almost all) assertions.
  • Fixed file-access problems by using with... construct (not complete).
    • Replace open --> with open for index (base and TC).
  • Enabled some TCs, more remain dormant (ie the Hook TC never runs on Linux).
    • test_index.py: Enabled dormant fake-symlink assertion.
  • Various encoding issues.
  • Debug-log all Popen() calls, to collect cmd usage.

Apveyor results

  • Py27: FAILED (SKIP=3, errors=1)
  • Py34: FAILED (SKIP=3, errors=3)
  • Py35 : FAILED (SKIP=3, errors=3)

Site Regression

Unfortunately, building site started to fail on Travis - but has nothing to do with my changes.
Happens also for older jobs, if re-started. The error is this:

$ if [ "$TRAVIS_PYTHON_VERSION" == '3.5' ]; then cd doc && make html; fi
mkdir -p build/html build/doctrees
sphinx-build -b html -d build/doctrees   source build/html
Traceback (most recent call last):
  File "/home/travis/virtualenv/python3.5.2/bin/sphinx-build", line 11, in <module>
    sys.exit(main())
  File "/home/travis/virtualenv/python3.5.2/lib/python3.5/site-packages/sphinx/__init__.py", line 51, in main
    sys.exit(build_main(argv))
  File "/home/travis/virtualenv/python3.5.2/lib/python3.5/site-packages/sphinx/__init__.py", line 61, in build_main
    from sphinx import cmdline
  File "/home/travis/virtualenv/python3.5.2/lib/python3.5/site-packages/sphinx/cmdline.py", line 23, in <module>
    from sphinx.application import Sphinx
  File "/home/travis/virtualenv/python3.5.2/lib/python3.5/site-packages/sphinx/application.py", line 38, in <module>
    from sphinx.environment import BuildEnvironment
  File "/home/travis/virtualenv/python3.5.2/lib/python3.5/site-packages/sphinx/environment.py", line 38, in <module>
    from sphinx.io import SphinxStandaloneReader, SphinxDummyWriter, SphinxFileInput
  File "/home/travis/virtualenv/python3.5.2/lib/python3.5/site-packages/sphinx/io.py", line 16, in <module>
    from sphinx.transforms import ApplySourceWorkaround, ExtraTranslatableNodes, Locale, \
ImportError: cannot import name 'Locale'
make: *** [html] Error 1

@Byron
Copy link
Member

Byron commented Oct 1, 2016

Here is a patch that resolves the merge conflict. In case you didn't fix it already, please feel free to apply it.

Details
commit 5aba04fbb615310c6cab38abe901f5147f0caec1
Merge: 4592785 3959556
Author: Sebastian Thiel <byronimo@gmail.com>
Date:   Sat Oct 1 17:29:18 2016 +0200

    Merge branch 'appveyor' of https://github.com/ankostis/GitPython into ankostis-appveyor

diff --cc setup.py
index 35b1115,2e8ee52..ff34885
--- a/setup.py
+++ b/setup.py
@@@ -70,24 -68,11 +70,27 @@@ def _stamp_version(filename)
          print("WARNING: Couldn't find version line in file %s" % filename, file=sys.stderr)

  install_requires = ['gitdb >= 0.6.4']
 +extras_require = {
 +    ':python_version == "2.6"': ['ordereddict'],
 +}
 +
 +try:
 +    if 'bdist_wheel' not in sys.argv:
 +        for key, value in extras_require.items():
 +            if key.startswith(':') and pkg_resources.evaluate_marker(key[1:]):
 +                install_requires.extend(value)
 +except Exception:
 +    logging.getLogger(__name__).exception(
 +        'Something went wrong calculating platform specific dependencies, so '
 +        "you're getting them all!"
 +    )
 +    for key, value in extras_require.items():
 +        if key.startswith(':'):
 +            install_requires.extend(value)
- # end
++            
+ test_requires = ['node', 'ddt']
+ if sys.version_info[:2] < (2, 7):
 -    install_requires.append('ordereddict')
+     test_requires.append('mock')
 -# end

  setup(
      name="GitPython",

@Byron
Copy link
Member

Byron commented Oct 1, 2016

@ankostis By the way, what do you think about an intermediate merge ? You could start out freshly on a new PR then for the last meters, while greatly reducing the risk of merge-conflicts.

@ankostis
Copy link
Contributor Author

ankostis commented Oct 1, 2016

Merged it.

@Byron practically there are 3 error remaining, and I believe all 3 of them are related to gitdb:

  • test_random_access (git.test.performance.test_odb.TestObjDBPerformance):

    File "c:\python34-x64\lib\site-packages\smmap\util.py", line 192, in _read_into_memory
        mf += d
    nose.proxy.TypeError: Can't convert 'bytes' object to str implicitly
    
  • test_file_handle_leaks (git.test.test_repo.TestRepo):
    Same error as above.

  • test_base_rw (git.test.test_submodule.TestSubmodule):

    Caused by smmap (pobably) holding the file asked to be removed.

    File "C:\projects\gitpython\git\util.py", line 72, in onerror
      func(path)  # Will scream if still not possible to delete.
    nose.proxy.PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\appveyor\\AppData\\Local\\Temp\\1\\tmpe8w_iksmnon_bare_test_base_rw\\git\\ext\\gitdb\\gitdb\\ext\\smmap'
    

Can you provide any hint on how to resolve them?

+ Some cases had restructuring of code.
@Byron
Copy link
Member

Byron commented Oct 1, 2016

@ankostis I agree, the errors you mention seem to be rooted in smmap and would thus have to be fixed there. Especially the one with the incompatible type appears like something that is relatively easy reproduce and fix via smmap.

A hint I have not right now, just because it was years ago that I have touched the project for the last time. However, I could believe it is riddled with resource-related issues as I was banking on deterministic destructors back then.

For now I will just merge the PR and try to setup the appveyor hooks. Maybe as a contributor, you can even do that yourself in case it doesn't work for me.

@ankostis
Copy link
Contributor Author

ankostis commented Oct 1, 2016

Just pushed one last commit with those 3 TCs appropriately skipped.
-- ALL GREEN now --

Especially the one with the incompatible type appears like something that is relatively easy reproduce and fix via smmap.

Actually quite the contrary: I can only reproduce it when i run the full harness; if i launch these 2 TCs alone, they pass! So practically, I cannot enter debug-mode - I have to wait 2' just to reach this point.

PS: You have been invited to become a contributor/maintainer.

Have not received an invitation yet.

@Byron
Copy link
Member

Byron commented Oct 1, 2016

This is odd:
screen shot 2016-10-01 at 20 54 24

I have just re-invited you, maybe it comes through. Otherwise you would just have to visit https://github.com/gitpython-developers to see the invitation.

Your last commit was not part of my merge, I will fetch it separately.

@Byron
Copy link
Member

Byron commented Oct 1, 2016

@ankostis I think it worked ! The last commit has been cherry-picked and AppVeyor is on its way to all-green ! Even though it's not the real thing just yet, I call it an impressive piece of work! Thanks so much !

Right now I wouldn't know how to tackle the smmap issue, just as it's hard to fix anything in a project that is installed as a dependency. Probably it would be easiest if it would be reproducible locally, which I can't even do.
Just so you know, I will probably go to read-only mode for the next days due to my travels, but should be back with replies by the coming weekend.

@ankostis
Copy link
Contributor Author

ankostis commented Oct 1, 2016

@Byron many of the fixes in this PR can be applied also on gitdb, and as you said (haven't checked) on smmp. Is there a chance to tackle these issues in these projects? Or are they "frozen" somehow?

I mean, a lot of problems just go away if resource classes are retrofitted as with .... context resources - but this means API forward compatibility is broken for future clients (existing code should work with the modified libs though).

What do you think, is it worth the effort (but cna't promise anything)?

@ankostis
Copy link
Contributor Author

ankostis commented Oct 2, 2016

For reference, reporting the CORRECT Appveyor results after #521 fixes by yarikoptic & ankostis restored skipped TCs:

PY27: 5 errors:
PY34: 4 errors:
PY35: 7 errors:
conda35: 7 errors:

@Byron
Copy link
Member

Byron commented Oct 22, 2016

@ankostis With the latest upgrade of gitdb and smmap to 2.0, it's finally possible again to make new releases. This would allow the changes you talk about to be retro-fitted indeed, and given all the subtle problems GitPython has, it would certainly be worth doing that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

2 participants