Unable to reach yum repositories on Terraform EC2 instances

Posted 2025-02-11 18:08:50


Background:

I used Terraform to build an AWS autoscaling group with a few instances spread across availability zones and linked by a load balancer. Everything is created properly, but the load balancer has no valid targets because there's nothing listening on port 80.
Fine, I thought. I'll install NGINX and throw up a basic config.
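
(The bootstrap itself isn't shown anywhere in this question; purely as a sketch, a minimal user_data script for that on Amazon Linux 2, where nginx lives in the extras repository, might look like this:)

#!/bin/bash
# Minimal sketch, assuming Amazon Linux 2: enable the nginx1 extras topic,
# install nginx, and start it so the instance answers on port 80.
amazon-linux-extras enable nginx1
yum clean metadata
yum install -y nginx
systemctl enable --now nginx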

Expected behavior

Instances should be able to reach the yum repositories.

Actual behavior

I'm unable to ping anything or run any package manager commands; I get the following error:

Could not retrieve mirrorlist https://amazonlinux-2-repos-us-east-2.s3.dualstack.us-east-2.amazonaws.com/2/core/latest/x86_64/mirror.list error was
12: Timeout on https://amazonlinux-2-repos-us-east-2.s3.dualstack.us-east-2.amazonaws.com/2/core/latest/x86_64/mirror.list: (28, 'Failed to connect to amazonlinux-2-repos-us-east-2.s3.dualstack.us-east-2.amazonaws.com port 443 after 2700 ms: Connection timed out')


 One of the configured repositories failed (Unknown),
 and yum doesn't have enough cached data to continue. At this point the only
 safe thing yum can do is fail. There are a few ways to work "fix" this:

     1. Contact the upstream for the repository and get them to fix the problem.

     2. Reconfigure the baseurl/etc. for the repository, to point to a working
        upstream. This is most often useful if you are using a newer
        distribution release than is supported by the repository (and the
        packages for the previous distribution release still work).

     3. Run the command with the repository temporarily disabled
            yum --disablerepo=<repoid> ...

     4. Disable the repository permanently, so yum won't use it by default. Yum
        will then just ignore the repository until you permanently enable it
        again or use --enablerepo for temporary usage:

            yum-config-manager --disable <repoid>
        or
            subscription-manager repos --disable=<repoid>

     5. Configure the failing repository to be skipped, if it is unavailable.
        Note that yum will try to contact the repo. when it runs most commands,
        so will have to try and fail each time (and thus. yum will be be much
        slower). If it is a very temporary problem though, this is often a nice
        compromise:

            yum-config-manager --save --setopt=<repoid>.skip_if_unavailable=true

Cannot find a valid baseurl for repo: amzn2-core/2/x86_64

Troubleshooting steps taken

I'm new to Terraform, and I'm still having trouble getting automated provisioning via user_data to work, so I SSH'd into the instance instead. The instance is set up in a public subnet with an auto-provisioned public IP (the launch settings aren't shown here; a sketch of that piece follows the routing config further down). Below is the code for the security groups.

resource "aws_security_group" "elb_webtrafic_sg" {
    name        = "elb-webtraffic-sg"
    description = "Allow inbound web trafic to load balancer"
    vpc_id      = aws_vpc.main_vpc.id
    ingress {
        description = "HTTPS trafic from vpc"
        from_port        = 443
        to_port          = 443
        protocol         = "tcp"
        cidr_blocks      = ["0.0.0.0/0"]
    }
    ingress {
        description = "HTTP trafic from vpc"
        from_port        = 80
        to_port          = 80
        protocol         = "tcp"
        cidr_blocks      = ["0.0.0.0/0"]
    }
    ingress {
        description = "allow SSH"
        from_port        = 22
        to_port          = 22
        protocol         = "tcp"
        cidr_blocks      = ["0.0.0.0/0"]
    }
    egress {
        description = "all traffic out"
        from_port        = 0
        to_port          = 0
        protocol         = "-1"
        cidr_blocks      = ["0.0.0.0/0"]
    }
    tags        = {
        Name = "elb-webtraffic-sg"
    }
}

resource "aws_security_group" "instance_sg" {
    name        = "instance-sg"
    description = "Allow traffic from load balancer to instances"
    vpc_id      = aws_vpc.main_vpc.id
    ingress {
        description = "web traffic from load balancer"
        security_groups  = [ aws_security_group.elb_webtrafic_sg.id ]
        from_port        = 80
        to_port          = 80
        protocol         = "tcp"
    }
    ingress {
        description = "web traffic from load balancer"
        security_groups  = [ aws_security_group.elb_webtrafic_sg.id ]
        from_port        = 443
        to_port          = 443
        protocol         = "tcp"
    }
    ingress {
        description = "ssh traffic from anywhere"
        from_port        = 22
        to_port          = 22
        protocol         = "tcp"
        cidr_blocks      = ["0.0.0.0/0"]
    }
    egress {
        description = "all traffic to load balancer"
        security_groups  = [ aws_security_group.elb_webtrafic_sg.id ]
        from_port        = 0
        to_port          = 0
        protocol         = "-1"
    }
    tags        = {
        Name = "instance-sg"
    }
}

# This is a workaround for the cyclical security group ID references.
# I would like to find a way to make Terraform destroy these rules first;
# right now the stack takes longer to destroy than to create.
# Terraform hangs because of the dependency each SG has on the other,
# but it eventually works its way down to this rule and deletes it, clearing the deadlock.
# (A cycle-free sketch follows the two rule resources below.)
resource "aws_security_group_rule" "elb_egress_to_webservers" {
  security_group_id        = aws_security_group.elb_webtrafic_sg.id
  type                     = "egress"
  source_security_group_id = aws_security_group.instance_sg.id
  from_port                = 80
  to_port                  = 80
  protocol                 = "tcp"
}

resource "aws_security_group_rule" "elb_tls_egress_to_webservers" {
  security_group_id        = aws_security_group.elb_webtrafic_sg.id
  type                     = "egress"
  source_security_group_id = aws_security_group.instance_sg.id
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
}
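
(For reference, one common way to avoid the cycle entirely, which this config does not use, is to declare both groups with no inline rules and attach every rule as a standalone aws_security_group_rule. The resource names in this sketch are hypothetical:)

# Rough sketch of a cycle-free layout, not from the question: the groups
# themselves carry no rules, so neither group's definition references the other.
resource "aws_security_group" "elb_sg" {
  name   = "elb-sg"
  vpc_id = aws_vpc.main_vpc.id
}

resource "aws_security_group" "web_sg" {
  name   = "web-sg"
  vpc_id = aws_vpc.main_vpc.id
}

# Rules reference both groups, but only rules depend on groups - never group on group.
resource "aws_security_group_rule" "elb_to_web_http" {
  type                     = "egress"
  security_group_id        = aws_security_group.elb_sg.id
  source_security_group_id = aws_security_group.web_sg.id
  from_port                = 80
  to_port                  = 80
  protocol                 = "tcp"
}

resource "aws_security_group_rule" "web_from_elb_http" {
  type                     = "ingress"
  security_group_id        = aws_security_group.web_sg.id
  source_security_group_id = aws_security_group.elb_sg.id
  from_port                = 80
  to_port                  = 80
  protocol                 = "tcp"
}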

Since I was able to SSH into the machine, I tried changing the web instance security group to allow direct connections from the internet to the instance. Same result: I can't ping outside web addresses, and yum commands fail with the same error.
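
(That temporary test rule isn't shown; assuming it was added as a standalone rule on the instance security group, it would have looked roughly like this, with a hypothetical resource name:)

# Hypothetical sketch of the temporary test rule: open HTTP to the world
# directly on the instance security group, bypassing the load balancer.
resource "aws_security_group_rule" "instance_http_from_anywhere" {
  type              = "ingress"
  security_group_id = aws_security_group.instance_sg.id
  from_port         = 80
  to_port           = 80
  protocol          = "tcp"
  cidr_blocks       = ["0.0.0.0/0"]
}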

I can ping the default gateway in each subnet: 10.0.0.1, 10.0.1.1, and 10.0.2.1.

Here is the routing configuration I currently have set up.

resource "aws_vpc" "main_vpc" {
  cidr_block    = "10.0.0.0/16"
  tags          = {
    Name = "production-vpc"
  }
}

resource "aws_key_pair" "aws_key" {
  key_name = "Tanchwa_pc_aws"
  public_key = file(var.public_key_path)
}

#internet gateway
resource "aws_internet_gateway" "gw" {
  vpc_id = aws_vpc.main_vpc.id
  tags = {
    Name = "internet-gw"
  } 
}


resource "aws_route_table" "route_table" {
  vpc_id = aws_vpc.main_vpc.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.gw.id
  }

  tags = {
    Name = "production-route-table"
  }
}


resource "aws_subnet" "public_us_east_2a" {
  vpc_id     = aws_vpc.main_vpc.id
  cidr_block = "10.0.0.0/24"
  availability_zone = "us-east-2a"

  tags = {
    Name = "Public-Subnet us-east-2a"
  }
}

resource "aws_subnet" "public_us_east_2b" {
  vpc_id     = aws_vpc.main_vpc.id
  cidr_block = "10.0.1.0/24"
  availability_zone = "us-east-2b"

  tags = {
    Name = "Public-Subnet us-east-2b"
  }
}

resource "aws_subnet" "public_us_east_2c" {
  vpc_id     = aws_vpc.main_vpc.id
  cidr_block = "10.0.2.0/24"
  availability_zone = "us-east-2c"

  tags = {
    Name = "Public-Subnet us-east-2c"
  }
}


resource "aws_route_table_association" "a" {
    subnet_id = aws_subnet.public_us_east_2a.id
    route_table_id = aws_route_table.route_table.id
}

resource "aws_route_table_association" "b" {
    subnet_id = aws_subnet.public_us_east_2b.id
    route_table_id = aws_route_table.route_table.id
}

resource "aws_route_table_association" "c" {
    subnet_id = aws_subnet.public_us_east_2c.id
    route_table_id = aws_route_table.route_table.id
}
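
(The autoscaling group's launch settings aren't included in the question, and the subnets above don't set map_public_ip_on_launch. Assuming the auto-provisioned public IP comes from a launch template, that part might look roughly like this; the resource name, AMI variable, and bootstrap path are hypothetical:)

# Hypothetical sketch, not from the question: where the public IP and
# user_data settings usually live for instances launched by an ASG.
resource "aws_launch_template" "web" {
  name_prefix   = "web-"
  image_id      = var.ami_id      # assumed variable for the Amazon Linux 2 AMI
  instance_type = "t3.micro"
  key_name      = aws_key_pair.aws_key.key_name

  network_interfaces {
    # Without a NAT gateway, instances need a public IP to reach the yum mirrors.
    associate_public_ip_address = true
    security_groups             = [aws_security_group.instance_sg.id]
  }

  # Launch template user_data must be base64-encoded.
  user_data = base64encode(file("${path.module}/bootstrap.sh"))  # assumed script path
}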


Comments (1)

咋地 2025-02-18 18:08:50


This article (in Japanese) says adding a dependency from aws_instance to aws_nat_gateway was required.

In the same way, adding a dependency on aws_internet_gateway worked fine in my case.

ec2.tf:

resource "aws_instance" "your_ec2" {
    :
  depends_on = [
    aws_internet_gateway.gw
  ]

  user_data = <<EOF
    #!/bin/bash
    yum update -y
    yum install -y jq
    EOF
    :
}
