
It's been a while since I asked this question. To simplify, I just want a lifecycle configuration in AWS SageMaker which can successfully install a private GitHub repo.


I'm trying to install a private GitHub repo with a bash script. The script does the following:

  • makes sure there's an ssh agent active
  • adds the ssh key from persistent storage (the SageMaker directory)
  • attempts to install the GitHub repo

This all runs on an AWS SageMaker EC2 instance via a lifecycle configuration. The implementation looks something like this:

HOME=/home/ec2-user/
ENVPIP=$HOME/anaconda3/envs/tensorflow2_p36/bin/pip

eval "$(ssh-agent -s)"
ssh-add ${HOME}SageMaker/Setup/id_rsa

yes | $ENVPIP install git+ssh://git@github.com/...

Running this, I get the following error:

ERROR: Command errored out with exit status 128: git clone -q 'ssh://****@github.com/...' /tmp/pip-req-build-ysacff_l Check the logs for full command output.

Here's all the pertinent output from CloudWatch:

Agent pid 5146

Identity added: /home/ec2-user/SageMaker/Setup/id_rsa (/home/ec2user/SageMaker/Setup/id_rsa)

2020-09-07T17:11:00.605-04:00

Collecting git+ssh://****@github.com/********1/*****-*****Library
  Cloning ssh://****@github.com/********1/*****-*****Library to /tmp/pip-req-build-ysacff_l

2020-09-07T17:11:00.605-04:00

ERROR: Command errored out with exit status 128: git clone -q 'ssh://****@github.com/********1/*****-*****Library' /tmp/pip-req-build-ysacff_l Check the logs for full command output.

Looking into it, this seems like an issue with the cloning protocol, but I couldn't find anything pertinent to ssh.


P.S.

  • Running the same few lines in the terminal works.
  • I sanity checked the URL to the repo by opening it directly, so I don't think it's a problem with anything after the ...

Updates:

  • Tried updating git with yum install git. Apparently my version is already up to date, and the error was unchanged.
  • I commented out the pip install so that the EC2 instance would start up successfully, then ran curl http://www.google.com, which returned a bunch of HTML. So outbound traffic appears to be allowed, at least after the EC2 instance boots.
  • Running curl http://www.google.com within the bash script (the lifecycle configuration, with the problematic code commented out) produces the same HTML output, and the instance starts up fine. This leads me to believe that outbound traffic is indeed allowed during instance startup.
  • A lot of people have viewed this question, and no one has answered it. I'm not married to the specific way I'm trying to install the repo, so if there are any working alternatives I'll gladly take them.
  • Is it possible that I'm hitting a race condition with some other system? This happens close to when the instance starts. Is there any way to check that all dependent systems are running?
  • While doing some other stuff in the console, I got the same error. I reinitialized the ssh agent, added the key, and it worked. I wonder if it's a race condition between eval "$(ssh-agent -s)" and yes | $ENVPIP install git+ssh://git@github.com/...?
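
One way the race-condition idea in the last two updates could be probed is to wrap the failing command in a retry loop and watch whether a later attempt succeeds. This is only a minimal sketch: the `retry` helper and its variable names are hypothetical, the attempt count and delay are arbitrary, and nothing here is confirmed to fix the error above.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, sleeping between attempts.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
    retry_attempts="$1"
    retry_delay="$2"
    shift 2
    retry_i=1
    while [ "$retry_i" -le "$retry_attempts" ]; do
        # Stop as soon as the wrapped command succeeds.
        if "$@"; then
            return 0
        fi
        echo "attempt $retry_i of $retry_attempts failed: $*" >&2
        # Do not sleep after the final attempt.
        if [ "$retry_i" -lt "$retry_attempts" ]; then
            sleep "$retry_delay"
        fi
        retry_i=$((retry_i + 1))
    done
    return 1
}

# In the lifecycle configuration this might look like (URL left elided as above):
# retry 5 10 "$ENVPIP" install "git+ssh://git@github.com/..."
```

If the install only ever succeeds on a later attempt, that would point toward a timing problem; if every attempt fails the same way, the cause is more likely the key or the host verification.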
Warlax56
  • Show the logs please. – Qumber Sep 08 '20 at 05:00
  • Also, can your instance send outbound traffic? Maybe try `curl http://example.com`? – Qumber Sep 08 '20 at 05:09
  • I added the cloudwatch output, but I'm not sure how to `Check the logs for full command output.` on an EC2 instance that failed to start. Currently working on curling, I'll add it to the updates – Warlax56 Sep 08 '20 at 16:20
  • What is the version of git on the EC2 instance ? – Constantin Konstantinidis Sep 13 '20 at 08:48
  • Public or private repo? If public, why not try the `https://` version of the repo instead of the `ssh://` one? If private, debug with `ssh -vvv -T git@github.com` – Tom Hale Sep 13 '20 at 15:37
  • It's a private repo. How would I go about debugging w/ `ssh -vvv -T git@github.com`? – Warlax56 Sep 13 '20 at 21:53
  • You're looking for a message that you've successfully authenticated. If it doesn't say that (and instead says, "Permission denied (publickey)") then you have a key problem. You can include the relevant portion in your question. – bk2204 Sep 13 '20 at 23:30
  • There is a possibility that when git invoked SSH, since this is a brand new instance, you got hit by the SSH warning "The authenticity of host 'github.com (IP ADDRESS)' can't be established." The `-q` probably silenced this. You may have to manually add GitHub's host key to `~/.ssh/known_hosts` (or whatever it will resolve to in the EC2 instance) – pepoluan Nov 21 '22 at 12:03
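
The host-key and debugging suggestions in the comments above could be combined along these lines. This is a hedged sketch, not a verified fix: the helper names, the `known_hosts` and log locations under `SageMaker/Setup/`, and the choice to bypass the ssh agent with an explicit key file are all assumptions. It also assumes the lifecycle script runs as root, in which case `~/.ssh` would not point at the `ec2-user` home, so the files are named explicitly instead. Paths are assumed to contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: append a host key line to a known_hosts file
# unless that exact line is already present.
add_host_key() {
    ahk_line="$1"
    ahk_file="$2"
    mkdir -p "$(dirname "$ahk_file")"
    touch "$ahk_file"
    grep -qxF -- "$ahk_line" "$ahk_file" || printf '%s\n' "$ahk_line" >> "$ahk_file"
}

# Hypothetical helper: print an ssh command line that uses an explicit key
# and known_hosts file and never prompts (BatchMode), so a host key problem
# shows up as an error instead of a silent failure.
build_git_ssh_command() {
    printf 'ssh -i %s -o IdentitiesOnly=yes -o BatchMode=yes -o UserKnownHostsFile=%s' "$1" "$2"
}

# Possible use in the lifecycle configuration (needs network, so commented out):
# KEY=/home/ec2-user/SageMaker/Setup/id_rsa
# KNOWN_HOSTS=/home/ec2-user/SageMaker/Setup/known_hosts   # assumed location
# ssh-keyscan github.com 2>/dev/null | while IFS= read -r line; do
#     add_host_key "$line" "$KNOWN_HOSTS"
# done
# export GIT_SSH_COMMAND="$(build_git_ssh_command "$KEY" "$KNOWN_HOSTS")"
# # Keep the verbose ssh output somewhere that survives a failed start.
# $GIT_SSH_COMMAND -vvv -T git@github.com > /home/ec2-user/SageMaker/Setup/ssh-debug.log 2>&1 || true
# yes | $ENVPIP install git+ssh://git@github.com/...
```

Note that `ssh-keyscan` trusts whatever key the network returns, so the scanned lines should be compared against the fingerprints GitHub publishes before relying on them. The `|| true` is there because `ssh -T git@github.com` exits non-zero even after a successful authentication, since GitHub provides no shell access.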

0 Answers