The situation is as follows: I need to provision two EC2 instances with Terraform, one to serve as an Ansible control node and the other as a managed node. The installation of Ansible and the configuration of the SSH key on the control node must be automated (as required by the professor), as well as the copying of the control node's public key into the managed node's authorized_keys file.
Provisioning appears to succeed: the SSH key is generated on the control node, the public key is copied to a file called public_key.txt on my local machine, and that file's contents are then appended to the authorized_keys file on the managed node. Everything seems to have been done correctly; however, when I try to establish an SSH connection from the control node to the managed node, the connection is denied.
The configuration files are as follows:
vpc.tf
resource "aws_vpc" "vpc_virginia" {
cidr_block = var.virginia_cidr
tags = {
"Name" : "vpc_virginia"
}
}
resource "aws_subnet" "public_subnet" {
vpc_id = aws_vpc.vpc_virginia.id
cidr_block = var.subnets[0]
map_public_ip_on_launch = true
tags = {
"Name" = "public_subnet"
}
}
resource "aws_internet_gateway" "igw" {
vpc_id = aws_vpc.vpc_virginia.id
tags = {
Name = "igw vpc virginia"
}
}
resource "aws_route_table" "public_crt" {
vpc_id = aws_vpc.vpc_virginia.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.igw.id
}
tags = {
Name = "public crt"
}
}
resource "aws_route_table_association" "crta_subnet" {
subnet_id = aws_subnet.public_subnet.id
route_table_id = aws_route_table.public_crt.id
}
resource "aws_security_group" "sg_control_node" {
name = "public_instance-sg"
description = "Allow SSH inbound traffic and all egress traffic"
vpc_id = aws_vpc.vpc_virginia.id
ingress {
description = "SSH over Internet"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = [var.sg_ingress_cidr]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}
tags = {
Name = "control-node_sg"
}
}
resource "aws_security_group" "sg_managed_node" {
name = "managed-node-sg"
description = "Allow SSH inbound traffic and all HTTP traffic"
vpc_id = aws_vpc.vpc_virginia.id
ingress {
description = "SSH over Internet"
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = [var.sg_ingress_cidr]
}
ingress {
description = "HTTP traffic"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = [var.sg_ingress_cidr]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
ipv6_cidr_blocks = ["::/0"]
}
tags = {
Name = "managed_node_sg"
}
}
I'm using a null_resource to copy the public key from one instance to the other.
ec2.tf
resource "aws_instance" "ec2_control" {
ami = "ami-0c7217cdde317cfec"
instance_type = "t2.micro"
subnet_id = aws_subnet.public_subnet.id
key_name = data.aws_key_pair.key.key_name
vpc_security_group_ids = [aws_security_group.sg_control_node.id]
provisioner "remote-exec" {
inline = [
"sudo apt-get update -y",
"sudo apt-get install -y software-properties-common",
"sudo apt-add-repository --yes --update ppa:ansible/ansible",
"sudo apt-get install -y ansible",
"sudo adduser ansible --disabled-password --gecos ''",
"sudo ssh-keygen -t rsa -b 4096 -f /home/ubuntu/.ssh/id_rsa -N ''",
]
connection {
type = "ssh"
host = self.public_ip
user = "ubuntu"
private_key = file("keys/mykey.pem")
}
}
tags = {
Name = "control_node"
}
}
resource "aws_instance" "ec2_managed" {
ami = "ami-0c7217cdde317cfec"
instance_type = "t2.micro"
subnet_id = aws_subnet.public_subnet.id
key_name = data.aws_key_pair.key.key_name
vpc_security_group_ids = [aws_security_group.sg_managed_node.id]
tags = {
Name = "managed_node"
}
}
resource "null_resource" "copy_public_key" {
provisioner "local-exec" {
command = "scp -o StrictHostKeyChecking=no -i keys/mykey.pem ubuntu@${aws_instance.ec2_control.public_ip}:/home/ubuntu/.ssh/id_rsa.pub public_key.txt"
}
provisioner "remote-exec" {
inline = [
"echo '${file("public_key.txt")}' >> /home/ubuntu/.ssh/authorized_keys"
]
connection {
type = "ssh"
user = "ubuntu"
private_key = file("keys/mykey.pem")
host = aws_instance.ec2_managed.public_ip
}
}
depends_on = [aws_instance.ec2_control, aws_instance.ec2_managed]
}
On the control node, the .ssh folder and its contents have the appropriate permissions.
Control node
drwx------ 2 ubuntu ubuntu 4096 Feb 9 03:57 .
drwxr-x--- 4 ubuntu ubuntu 4096 Feb 9 03:56 ..
-rw------- 1 ubuntu ubuntu 387 Feb 9 03:56 authorized_keys
-rw------- 1 root root 3389 Feb 9 03:57 id_rsa
-rw-r--r-- 1 root root 745 Feb 9 03:57 id_rsa.pub
On the managed node, the control node's public key has been copied into the authorized_keys file, and the permissions of the .ssh folder and the authorized_keys file are correct.
Managed node
drwx------ 2 ubuntu ubuntu 4096 Feb 9 03:56 .
drwxr-x--- 4 ubuntu ubuntu 4096 Feb 9 03:57 ..
-rw------- 1 ubuntu ubuntu 1133 Feb 9 03:58 authorized_keys
Result
The problem is that when I try to connect from the control node, I get the following response.
Load key "/home/ubuntu/.ssh/id_rsa": Permission denied
ubuntu@<managed-node-private-ip>: Permission denied (publickey).
And if I explicitly specify the private key, the result is the same.
ssh -i ~/.ssh/id_rsa ubuntu@<managed-node-private-ip>
Load key "/home/ubuntu/.ssh/id_rsa": Permission denied
ubuntu@<managed-node-private-ip>: Permission denied (publickey).
Frankly, I don't understand what's wrong. Is some additional configuration needed? Does anyone have any ideas?
Note: I am using private IPs to communicate between instances.
As anon-coward said, the problem was that the SSH key was being generated as root rather than as the ubuntu user (note the root ownership of id_rsa and id_rsa.pub in the listing above, which is why the ubuntu user gets "Permission denied" when loading its own key). The solution was to run ssh-keygen without sudo, and indeed, it worked.
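For reference, here is a minimal sketch of the corrected control-node resource, assuming the same Ubuntu AMI, key paths, and security group as above; the only change is that ssh-keygen runs as the connection user (ubuntu) instead of via sudo, so the generated key files end up owned by ubuntu:ubuntu rather than root:root.

resource "aws_instance" "ec2_control" {
  ami                    = "ami-0c7217cdde317cfec"
  instance_type          = "t2.micro"
  subnet_id              = aws_subnet.public_subnet.id
  key_name               = data.aws_key_pair.key.key_name
  vpc_security_group_ids = [aws_security_group.sg_control_node.id]

  provisioner "remote-exec" {
    inline = [
      "sudo apt-get update -y",
      "sudo apt-get install -y software-properties-common",
      "sudo apt-add-repository --yes --update ppa:ansible/ansible",
      "sudo apt-get install -y ansible",
      "sudo adduser ansible --disabled-password --gecos ''",
      # Run ssh-keygen as the connection user (ubuntu), not via sudo, so that
      # /home/ubuntu/.ssh/id_rsa and id_rsa.pub are owned by ubuntu:ubuntu.
      "ssh-keygen -t rsa -b 4096 -f /home/ubuntu/.ssh/id_rsa -N ''",
    ]

    connection {
      type        = "ssh"
      host        = self.public_ip
      user        = "ubuntu"
      private_key = file("keys/mykey.pem")
    }
  }

  tags = {
    Name = "control_node"
  }
}

On instances that are already running, the ownership could presumably also be repaired in place with something like sudo chown ubuntu:ubuntu /home/ubuntu/.ssh/id_rsa /home/ubuntu/.ssh/id_rsa.pub on the control node, but re-provisioning with the corrected command keeps the whole setup automated, as the assignment requires.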